Brief Reports

The Journal of Consulting Psychology will
accept Brief Reports of research studies in
clinical psychology for early publication without
expense to the author. The procedure is
intended to permit the publication of soundly
designed studies of specialized interest or limited
importance which cannot now be accepted
because of lack of space. Several pages
in each issue will be devoted to Brief Reports,
published in the order of their receipt without
respect to the dates of receipt of the regular
articles. Most Brief Reports appear in the
first or second issue to go to press following
their final acceptance.

An author who wishes to submit a Brief
Report:

1. Sends the Brief Report, limited to one printed
page and prepared according to the specifications
given below.

2. Also sends to the Editor a full report of the
research study, in sufficient detail to give a clear
account of its background, procedure, results, and
conclusions, which will be filed with the American
Documentation Institute to insure indefinite
availability.

3. Prepares at least 100 mimeographed copies of
the full report, which the author will send without
charge to all who request it as long as the supply
lasts.

4. Agrees not to submit the full report to another
journal of general circulation.

Specifications

Brief Report. The Brief Report should give
a clear, condensed summary of the procedure
of the study and as full an account of the results
as space permits.

To insure that the Brief Report will be no
longer than one printed page, its typescript,
including all matter except the title and the
author’s lines, must not exceed 75 lines averaging
42 characters and spaces in length.
Set the typewriter margins for short lines of
42 characters, which are 3.5 inches long in
elite typing, and 4.2 inches long in pica.
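
The line lengths quoted above follow from ordinary typewriter pitches. The sketch below assumes elite type at 12 characters per inch and pica at 10, values the notice does not state, and `check_typescript` is a hypothetical helper for illustration, not part of any journal procedure.

```python
# Typewriter pitches in characters per inch (assumed standard values,
# not stated in the notice).
ELITE_CPI = 12
PICA_CPI = 10

MAX_LINES = 75      # line quota for a Brief Report
TARGET_WIDTH = 42   # average characters and spaces per line


def line_length_inches(chars: int, cpi: int) -> float:
    """Physical width of a typed line of `chars` characters."""
    return chars / cpi


def check_typescript(lines: list[str]) -> bool:
    """Hypothetical check: at most 75 lines, averaging at most 42 characters."""
    if not lines:
        return True
    average = sum(len(line) for line in lines) / len(lines)
    return len(lines) <= MAX_LINES and average <= TARGET_WIDTH


print(line_length_inches(TARGET_WIDTH, ELITE_CPI))  # 3.5
print(line_length_inches(TARGET_WIDTH, PICA_CPI))   # 4.2
```

Under these assumptions a full quota of 75 lines holds 3,150 characters and spaces.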

The manuscript of the Brief Report must 
be double spaced throughout. Except for its 
short lines, it follows the standard style (1). 
Headings, tables, and references are avoided 
or, if essential, must be counted in the 75 
lines. Each Brief Report must be accom- 
panied by a footnote in the style below, 
which is typed on a separate sheet and not 
counted in the 75-line quota:

An extended report of this study may be ob- 
tained without charge from John Doe, 300 Market 
St., Prospect 6, Mass. (giving the author’s full name 
and address), or for a fee from the American
Documentation Institute. Order Document No. ____,
remitting $____ for microfilm or $____ for
photocopies.


Extended report. Because the extended re- 
port is intended for photoduplication, and is 
not copy to be sent to a printer, its style 
should differ in several ways from that of 
other manuscripts: (a) The extended report 
should be typed with single spacing for 
economy in duplication. (b) Tables and fig- 
ures should be placed adjacent to the text 
which refers to them. A caption should be 
typed below each figure. (c) Footnotes should 
be typed at the bottom of the page on which 
reference is made to them. In other respects, 
the full report is prepared in the style speci- 
fied by the Publication Manual (1). 
1. American Psychological Association, Council of
Editors. Publication manual of the American
Psychological Association (1957 rev.). Washington,
D. C.: American Psychological Association, 1957.
The Factorial Structure of the WAIS between Early
Adulthood and Old Age¹,²


Jacob Cohen 


Franklin D. Roosevelt Veterans Administration Hospital 
and New York University 


The publication of the Wechsler Adult In- 
telligence Scale (WAIS) (8, 20) has, for the 
first time, made it possible to investigate the 
factor content of a battery of individually ad- 
ministered reliable intelligence subtests on a 
large well-selected adult standardization popu- 
lation over a wide age range. Study of these 
data should provide some insight into the 
aging process. Further, it makes possible an 
estimate of the factorial equivalence of the 
WAIS and its predecessor, the Wechsler-Belle- 
vue Intelligence Scale, which has been sub- 
jected to considerable factor analytic study 
(1, 2, 4, 5, 7, 11, 12, 13, 16). Finally, it yields 
a factor analytically based rationale for the 
measurement functions of the subtests on a 
normal population which supplements a simi- 
lar analysis of these subtest item types per- 
formed on the Wechsler-Bellevue for neuro- 
logical and psychiatric patients (5). 


Method 
Subjects 


The four age groups studied were as fol- 
lows: ages 18-19 (N = 200), ages 25-34 
(N = 300), ages 45-54 (N = 300) and ages 
60-over 75 (N = 352). The first three groups 


¹ From the Psychology Service, Veterans Administration
Hospital, Montrose, New York. The author
is in many ways obligated to Drs. Leon L. Rackow, 
Manager, George Rosenberg, Director of Professional 
Services, Oskar Diethelm, Chairman, Deans’ Com- 
mittee, and Seymour G. Klebanoff, Chief, Psychology 
Service, but mostly for creating the type of atmos- 
phere conducive to research. A special debt of grati- 
tude was incurred to Professor David Wechsler, who 
stimulated this investigation. 

² A paper based on this investigation was read at
the meetings of the American Psychological Associa- 
tion in Chicago on September 5, 1956. 




were part of the regular standardization sam- 
ple, a sample stratified by age, sex, geographic 
region, urban-rural residence, race, occupation, 
and education (20). The 60-75 age group was 
a supplementary standardization group ob- 
tained in the Kansas City area which was se- 
lected so as to make it representative of the 
over-60 group in that area (8). Of the 352 
cases in this group, 160 were men and 192 
women. 


Analysis 


For the younger three groups, the matrices 
of intercorrelations among the subtests given 
in the manual (20) were used. For the Kan- 
sas City sample, Doppelt and Wallace pro- 
vide separate matrices of intercorrelations for 
the four age groups (60-64, 65-69, 70-74, 
and 75 and over) which comprise the total 
sample of old persons (8, pp. 324-327). These 
four matrices were combined into a single 
matrix by averaging via Fisher’s z transfor- 
mation the 66 sets of four coefficients between 
the same subtests (4, pp. 133-134). 
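
The averaging step can be sketched as follows. This is a minimal illustration that assumes an unweighted mean of the transformed values (the text does not say whether the four groups were weighted by their sizes); `average_correlations` is a hypothetical helper, and the four coefficients are invented for the example rather than taken from Doppelt and Wallace.

```python
import math


def average_correlations(rs):
    """Average correlations through Fisher's z transformation."""
    zs = [math.atanh(r) for r in rs]   # r -> z
    mean_z = sum(zs) / len(zs)         # unweighted mean of the z values
    return math.tanh(mean_z)           # mean z -> r


# Hypothetical coefficients for one pair of subtests in the four
# older age groups (60-64, 65-69, 70-74, 75 and over).
four_rs = [0.55, 0.60, 0.62, 0.48]
print(round(average_correlations(four_rs), 3))
```

Applying the function to each set of four coefficients in turn would yield the single combined matrix for the oldest group.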

The four matrices thus obtained were separately
subjected to the following analysis:³


1. Thurstone’s complete centroid method was used 
(18, pp. 161-170), with communalities estimated by 
his Equation 15 (18, pp. 300, 318). The solution was 
not reiterated. Three criteria were used to test for 
the completion of the extraction process, those of 
Saunders (3, pp. 300-301), McNemar (14), and Burt 
(6). In all four instances, Saunders’ criterion re- 
sulted in the extraction of five factors, while the 


³ To save printing costs, tables giving centroid
loadings and communalities, transformation matrices 
and intercorrelations among primaries have been de- 
posited with the American Documentation Institute. 
Order Document No. 5277, remitting $1.25 for micro- 
film or $1.25 for photocopies. 
others accepted only four. Although initially only
four were rotated, it was later discovered that a fifth
factor had to be accepted in all but the oldest group,
in order that the results would not seem inconsistent
among the groups for reasons of insufficient factor
extraction.⁴

2. Rotation was done blindly and independently in 
the four groups, using Thurstone’s method of two- 
dimensional sections (18, pp. 194-216). The rotation 
was to a criterion of oblique simple structure and a 
positive manifold with an attempt to maximize the 
variables within a ± .05 and ± .10 hyperplane, as
recommended by Cattell (3, pp. 235-236). 

3. The intercorrelations among the primary factors
were determined and subjected to a second-order
general factor analysis (18, pp. 273-277, 421-434), 
extended, as detailed by Thomson (17, pp. 189-191), 
to obtain the correlations of the subtests with the 
second-order general factor and also with the (or- 
thogonal) primary factor specifics. The latter yields 
the independent contributions of the general and pri- 
mary factors to the total variance of each subtest. 
From this, the contribution by the general factor and 
each primary specific to the total test was determined. 
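
The hyperplane criterion mentioned in step 2 amounts to counting near-zero loadings on a factor. A minimal sketch, assuming the count is simply the number of loadings whose absolute value is at or below the chosen width; `hyperplane_count` is a hypothetical helper, and the values are the Factor A loadings reported for ages 18-19 in Table 1, with decimal points restored.

```python
def hyperplane_count(loadings, width):
    """Number of loadings lying within +/- width of zero."""
    return sum(1 for x in loadings if abs(x) <= width)


# Factor A, ages 18-19 (Table 1), in subtest order Inf. through O. A.
factor_a_18_19 = [0.30, 0.33, 0.09, 0.23, -0.10, 0.24,
                  0.08, 0.07, -0.03, 0.14, -0.08]

print(hyperplane_count(factor_a_18_19, 0.05))  # 1
print(hyperplane_count(factor_a_18_19, 0.10))  # 6
```

A rotation toward simple structure tries to make such counts as large as possible on every factor.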


Results and Discussion 


Five factors resulted in all but the 60-over
75 group, which proved deficient in Factor E.⁴
Factor loadings in excess of .20 are accepted
as significant⁵ and, to facilitate comparison,
are followed by * when between .20 and .39
and ** when .40 and higher. Since the factors
which emerged from the rotation process were
the same in all the groups, the tables are organized
by factor and show the loadings of all
age groups a factor at a time.


⁴ The rotation of four centroid factors resulted in
a fourth factor which had one identity in the ages
18-19 and 45-54, and another in ages 25-34 and
60-over 75. Upon extraction and rotation in of a
fifth centroid for the three younger groups, this
discrepancy was resolved. What had occurred was that
the fourth and fifth centroids had emerged in different
order among these three groups. Both the fifth
and sixth centroids vanished upon rotation in the
60-over 75 group, resulting in the deficiency of that
group in Factor E.

⁵ The use of the .20 significance criterion in place
of the more conventional (and arbitrary) .30 is justified
as follows: This investigation can be looked upon
as a set of four replications of a factor-analysis;
therefore, consistency between groups is an effective
guard against the acceptance of nonsignificant loadings
as significant, and superior to the rule-of-thumb
.30 criterion, which would lead to spuriously inconsistent
results. Further justification of this criterion
is provided by the consideration that of the 228 factor
loadings which emerged, only eight were between
.10 and .20. Finally, the relatively large Ns employed
here would alone justify relaxation of the usual .30
criterion.


Table 1

Factor A Loadings of the Four Age Groups
on the WAIS

                        Age
Subtest    18-19   25-34   45-54   60-75+
Inf.        30*     21*     36*     29*
Comp.       33*     45**    27*     39*
Arith.      09      01      06      04
Sim.        23*     20*     32*     42**
D. Sp.     -10     -05     -06      00
Voc.        24*     48**    37*     37*
D. Sym.     08      04      01     -06
P. C.       07      03      0?      07
B. D.      -03     -01      00      00
P. A.       14      06      00      30*
O. A.      -08      00      00     -01
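
The marking convention for the tables can be expressed as a small rule. The sketch below is illustrative only: `mark_loading` is a hypothetical name, and judging a loading by its absolute size is an assumption, since no negative loading in the tables reaches .20.

```python
def mark_loading(loading):
    """Return the asterisks attached to a loading in the tables."""
    size = abs(loading)   # assumption: sign is ignored
    if size >= 0.40:
        return "**"
    if size >= 0.20:
        return "*"
    return ""


# Factor A loadings for ages 25-34 (Table 1), decimal points restored.
for subtest, loading in [("Inf.", 0.21), ("Comp.", 0.45),
                         ("Arith.", 0.01), ("Sim.", 0.20)]:
    print(subtest, f"{loading:.2f}{mark_loading(loading)}")
# Inf. 0.21*
# Comp. 0.45**
# Arith. 0.01
# Sim. 0.20*
```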


Factor A: Verbal Comprehension 


Factor A is clearly the familiar Verbal Com- 
prehension factor, loading in all groups In- 
formation, Comprehension, Similarities, and 
Vocabulary. This factor has emerged in all 
factor-analytic studies of the Wechsler-Belle- 
vue in normal groups at different age levels 
(1, 2, 7, 11, 12, 13, 16) as well as in psychi- 
atric patients (4, 5, 13, 16). There is little 
doubt that it is the same factor found in all 
general investigations in the intellectual do- 
main (cf. 17). 

The sole lack of uniformity in the signifi- 
cant loadings on Factor A is that of Picture 
Arrangement in the oldest group. It leads to 
the inference that at advanced ages at least 
some subjects depend upon “telling the story” 
to themselves in order to achieve solutions, 
and vary among themselves in the ability to 
do so successfully. It may also be related to
the already noted deficiency of this group in
Factor E, which, in the next youngest group
(45-54), does carry a loading on this test
(Table 4).


Factor B: Perceptual Organization 


The two tests loading this factor consistently
in all four age groups are Block Design
and Object Assembly. Present in all but the
45-54 age group, with just barely acceptable
loadings, is Picture Arrangement. A significant
loading for Digit Symbol appears only for the
oldest group.

This factor has been consistently found in 
factor analyses of the Wechsler-Bellevue, al- 
though it has variously been identified as per- 
formance (1, 13, 16), spatial-perceptual (12), 
closure (2), and nonverbal organization (4). 
Slight differences in the subtests which have 
loaded this factor are attributable to differ- 
ences in rotational criteria (1) and insuffi- 
ciency of factor extraction due to relatively 
small Ns (2, 13, 16). The common content 
of these interpretations is the non-verbal, 
perceptual, organizational characteristics. Be- 
cause of the loadings of Picture Arrangement 
and Digit Symbol, this factor resists inter- 
pretation as a spatial factor. Clearly, however, 
both speed and quality of perceptual perform- 
ance and organization are involved; hence, it 
is here named Perceptual Organization. It 
very likely represents a few highly correlated 
factors such as Thurstone’s Perceptual Speed, 
Closure, and Spatial Relations (19), which do 
not emerge separately because of the paucity 
of tests in this subdomain of the intellectual 
sphere. With reference tests included in the 
matrix, Davis (7) achieved such a further 
fractionation of this factor in the Wechsler- 
Bellevue. 

No hypothesis is offered to account for the
failure of Picture Arrangement to load significantly
in the 45-54 group. The appearance of
Digit Symbol in the 60-over 75 group is noteworthy,
since it parallels a finding in a previous
investigation of neuropsychiatric groups
with the Wechsler-Bellevue (4, 5). There it
was found that although Digit Symbol did not
load Factor B significantly in psychoneurotic
and schizophrenic groups, it loaded quite
heavily in the brain-damaged group. This was
interpreted as due to the fact that “...a
greater part of the variance of this test in the
brain-damaged is associated with visual organization
and simple speed than is the case
with the non-brain-damaged” (5, p. 276).
This interpretation may well apply here, the
operative factor being the brain damage which
occurs (differentially) with senescence, and
which results in enough perceptual speed and
discrimination variance for it to load Factor B.


Table 2

Factor B Loadings of the Four Age Groups
on the WAIS

                              Age
Subtest    18-19         25-34         45-54         60-75+
Inf.        00           -06            0?           -09
Comp.       08            03           -04            01
Arith.      10            00            02            06
Sim.       -07           -10            06            10
D. Sp.     -09            03            00            09
Voc.       -09           [illegible]   [illegible]   [illegible]
D. Sym.     14            09            04            29*
P. C.       09            02            0?            04
B. D.       34*           30*           35*           58**
P. A.       24*           22*          -03            20*
O. A.      [illegible]   [illegible]   [illegible]    ?6**


Table 3

Factor C Loadings of the Four Age Groups
on the WAIS

                        Age
Subtest    18-19   25-34   45-54   60-75+
Inf.        04      10      10      41**
Comp.       09      10      08      38*
Arith.      32*     32*     33*     5?**
Sim.        0?      02      10      10
D. Sp.      28*     24*     35*     4?**
Voc.        09      02      03      [illegible]
D. Sym.     08      02      09      21*
P. C.       01      06      02      01
B. D.       12      07      01      09
P. A.       10      09      02      09
O. A.       05      01      00      03


Factor C: Memory 


For the youngest three groups, Arithmetic
and Digit Span are the only two subtests loading
Factor C. This factor, too, has appeared in
most factorizations of the Wechsler-Bellevue,
and although its test composition was not always
identical, it has always been characterized
by high loadings on Arithmetic and Digit
Span (1, 2, 4, 5, 13, 16). It has variously
been interpreted as memory (1, 2), freedom
from distractibility (4, 5), attention-concentration
(13), and concentration-speed (16).
These interpretations are not as diverse as
they may seem at first glance; it is not unreasonable
to suppose that effective memory
Table 4

Loadings of Factors D and E of the Four Age Groups on the WAIS

                     Factor D                       Factor E
                        Age                            Age
Subtest    18-19   25-34   45-54   60-75+    18-19   25-34   45-54
Inf.        04      15     -08      10        01      02     -03
Comp.      -02      11      09     -10       -08     -07     -10
Arith.     -05      06      00      08        00     -09     -04
Sim.        21*     15      02     -02        00      17      06
D. Sp.      08     -05      02      03        26*     10      06
Voc.        16     -09     -03     -09        07      02      10
D. Sym.    -08      01      00      20*       24*     25*     31*
P. C.       29*     34*     21*     38*      -06      06      00
B. D.       03      04     -03      00        00      00      02
P. A.       06     -03      22*     10        04      04      22*
O. A.       04     -06      04      03        08      00     -06


ability, particularly in a test situation, is de- 
pendent upon the ability to attend during 
both the reception and reproduction phases 
of the learning-remembering process. Because 
memory is the more inclusive concept, and 
Thurstone (19) found Digit Span to load only 
his Memory factor, and, particularly, in the 
light of the functioning of the 60-over 75 
group discussed below, Factor C is here in- 
terpreted as Memory. 

A most striking phenomenon occurs in the 
Factor C loadings for the 60-over 75 group. 
In addition to high loadings in Arithmetic and 
Digit Span, there are also loadings, some quite 
high, in Vocabulary, Information, Compre- 
hension, and Digit Symbol. The addition of 
these tests does not require a change in the 
interpretation of C as a memory factor; if 
anything, it substantiates it. The hypothesis 
is offered that with senescence, some individu- 
als begin to deteriorate, or to state it more 
accurately, individual differences in rate of 
deterioration occur. This deterioration is re- 
flected in difficulties in the memory function 
which becomes an important source of indi- 
vidual differences (hence, score variance), and 
is more widely influential in its effect on test 
performance, invading the verbal subtests and 
Digit Symbol. Obviously, to respond success- 
fully to Vocabulary or Information items at 
any age requires that the information be re- 
membered, but until old age is reached, the 


variance in memory ability involved in these 
tests is inconsequential. With old age and dif- 
ferential rates of deterioration, scores on these 
verbal tests come to depend as much or more 
on memory ability as they do on verbal com- 
prehension ability. 


Factors D and E 


Factor D loads Picture Completion consistently
in all four groups, alone (25-34), and
variously with Similarities (18-19), Picture 
Arrangement (45-54), and Digit Symbol 
(60-over 75). These diverse loadings, being 
perfectly inconsistent, are not trustworthy for 
use in the interpretation of the factor. Factor 
D, then, is a quasi-specific which is not inter- 
preted other than calling it a Picture Com- 
pletion factor. 

A similar state of affairs exists for Factor E. 
It loads Digit Symbol consistently in the three 
groups in which it appears, alone in the 25-34 
group, with Digit Span (18-19), and with 
Picture Arrangement (45-54). Again, this fac- 
tor is a quasi-specific which is left uninter- 
preted other than identifying it as a Digit 
Symbol factor. Its deficiency in the 60-over
75 group is not surprising. In this group, Digit
Symbol gives up its communal variance in
atypical loadings on Factors B, C, and D.
Apparently, the specific ability demanded by
Digit Symbol at younger ages ceases to be
important in senescence, and three other factors
which had previously not affected Digit
Symbol begin to do so.

With regard to both Factors D and E, it 
should be noted that although they may be 
encroached upon by error, the consistency 
with which they load Picture Completion and 
Digit Symbol, respectively, makes them un- 
questionably real phenomena. They are minor 
factors which give rise to small portions of 
the total communal variance, and, being spe- 
cifics, do not help in the understanding of 
what the subtests measure, except negatively. 
They do have some importance, however, in 
bringing the results of this study into line 
with those of factor-analytic studies of the 
Wechsler-Bellevue. The discrepancies have, on 
the whole, been minor, and are partly at- 
tributable to different rotational criteria (or- 
thogonal vs. oblique) and insufficient rotation. 
The present results suggest that these researchers
failed to extract enough factors.
They were justified in so doing by their 
relatively small Ns, and the risk of accepting 
as real a factor which represents only sam- 
pling error. In the present investigation, the 
fact that four large groups were studied inde- 
pendently and with blind rotations gives much 
confidence in the reality of the results. Previ- 
ous studies, in extracting only three factors 
(the typical number), had the Picture Com- 
pletion test loading the B factor (12, 13, 16), 
or both the A and B factors (2, 4, 5), and the 
Digit Symbol subtest loading the C factor (4, 
5, 12), the B factor (4, 5), or both the B and 
C factors (2, 16). These two tests have been 
the ones showing by far the greatest incon- 
sistencies among studies, the reason for which, 
it is suggested here, is simply the consequence 
of the technical problem of insufficient factor 
extraction rather than of true differences in 
factor make-up. 


The Second-Order Analysis 


The 36 intercorrelations among the primary 
factors for the four groups (see footnote 3) 
ranged from .43 to .89, with a median of .70, 
the middle 18 falling between .61 and .80, in- 
clusive. This gave rise to a strong general sec- 
ond-order factor. Table 5 gives the correlation 
of the subtests and the primary factors with 
G, which is interpreted as present general in- 
tellectual ability. The G correlations of the 
subtests are quite high (median = .70), and of
the primaries even higher (median = .85). 
Given these magnitudes, there can be little 
question that the WAIS Full Scale IQ is 
loaded very strongly with G for adults, cer- 
tainly up to the age where deterioration be- 
gins to set in. 
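
The count of 36 intercorrelations given at the start of this section is consistent with five primary factors in each of the three younger groups and four in the oldest. The short check below assumes that every pair of primaries within a group contributes one coefficient; the dictionary name is illustrative.

```python
from math import comb

# Number of primary factors retained in each age group.
factors_per_group = {"18-19": 5, "25-34": 5, "45-54": 5, "60-over 75": 4}

# One intercorrelation per unordered pair of primaries within a group.
pairs = {group: comb(k, 2) for group, k in factors_per_group.items()}

print(pairs)                # {'18-19': 10, '25-34': 10, '45-54': 10, '60-over 75': 6}
print(sum(pairs.values()))  # 36
```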

As has been found in the past (4, 5), the 
“essentially verbal” subtests, i.e., those load- 
ing the A factor, particularly Vocabulary and 
Information, are the best measures of G over 
most of the adult age range, and Object As- 
sembly and Digit Span the poorest. As do 
most generalizations about our data, this 
breaks down partly for the 60-over 75 group.
Not only does the G variance fall off in this 
group (see below), but the Factor A tests are 
no longer clearly the best measures of G. In 
particular, the correlation with G of Vocabu- 
lary, both absolutely and relative to those of 
the other subtests, falls off sharply, and pre- 
sents yet another factor-analytic validation of 
a consequence of the Babcock hypothesis (4, 
5). That is, Vocabulary does not measure 
present general intellectual ability in the aged 
as well as it does in younger normal groups, 
or as well as some other tests do. The hypothesized
reason for this is that Vocabulary,
in contrast with other tests, tends to resist
the general effect of age deterioration, and
consequently does not measure G as well (5).


Table 5

Correlations with G of WAIS Subtests and Primary
Factors for the Four Age Groups

                        Age
Subtest    18-19   25-34   45-54   60-75+
Inf.        83      84      82      73
Comp.       69      71      76      60
Arith.      68      71      74      63
Sim.        81      75      75      65
D. Sp.      63      59      65      48
Voc.        86      79      83      66
D. Sym.     66      64      65      71
P. C.       76      72      77      75
B. D.       71      71      69      65
P. A.       66      69      74      65
O. A.       65      59      68      58

Primary
Factor
A           84      86      90      76
B           75      79      88      74
C           81      92      89      62
D           88      85      94      92
E           79      86      78      -


Table 6

Per Cent Contribution to the Total Variance of G and
Each Factor Specific for the Four Age
Groups on the WAIS

                       Orthogonal Factor
Age Group    G       A      B      C      D      E      Sum
18-19       52.7    4.5    6.0    2.3    2.1    1.7    69.3
25-34       50.0    5.2    5.3    1.7    2.5    1.3    66.0
45-54       54.3    4.2    2.8    2.3    0.9    1.6    66.1
60-75+      42.1    5.9    6.1    9.7    0.?    -      64.7

The second-order analyses included, for
each group, the determination of the correlation
of each subtest with that part of each
primary factor which is not correlated with
G, that is, the independent (orthogonal) contribution
of each primary-specific (17, pp.
189-191). This makes possible the determination
of the per cent contribution of each
specific factor to the total variance, as well
as that of G (see Table 6).
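
For an orthogonal factor, the per cent contribution to the total variance can be taken as the mean of the squared subtest correlations with that factor. The sketch below makes that assumption explicit and applies it to the subtest correlations with G in Table 5; it reproduces the G column of Table 6, but it is offered as an illustration, not as the author's stated computing routine. `percent_contribution` is a hypothetical helper.

```python
# Correlations of the eleven subtests with G (Table 5), in hundredths,
# in subtest order Inf. through O. A.
g_correlations = {
    "18-19":  [83, 69, 68, 81, 63, 86, 66, 76, 71, 66, 65],
    "25-34":  [84, 71, 71, 75, 59, 79, 64, 72, 71, 69, 59],
    "45-54":  [82, 76, 74, 75, 65, 83, 65, 77, 69, 74, 68],
    "60-75+": [73, 60, 63, 65, 48, 66, 71, 75, 65, 65, 58],
}


def percent_contribution(correlations_in_hundredths):
    """Mean squared correlation, expressed as a percentage."""
    squares = [(r / 100) ** 2 for r in correlations_in_hundredths]
    return 100 * sum(squares) / len(squares)


for group, rs in g_correlations.items():
    print(group, round(percent_contribution(rs), 1))
# 18-19 52.7
# 25-34 50.0
# 45-54 54.3
# 60-75+ 42.1
```

The same calculation applied to the subtest correlations with each primary-specific would give the remaining columns, which cannot be checked here because those correlations are in the deposited tables.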

From Table 6, it is apparent that about 
two-thirds of the total variance of the eleven 
subtests of the WAIS is shared among them. 
The general factor is responsible for the shar- 
ing of about one-half of the total variance or 
about three-quarters of the communal vari- 
ance (at least for the younger three groups), 
and is by far the most important of the 
sources of correlation among the subtests. 
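
The proportions quoted in this paragraph can be checked directly from the G and Sum columns of Table 6. A brief sketch; the variable and function names are illustrative.

```python
# (G, Sum) as per cent of total variance, from Table 6.
table6 = {
    "18-19":  (52.7, 69.3),
    "25-34":  (50.0, 66.0),
    "45-54":  (54.3, 66.1),
    "60-75+": (42.1, 64.7),
}


def g_share_of_communal(g_percent, communal_percent):
    """Fraction of the shared (communal) variance that is due to G."""
    return g_percent / communal_percent


for group, (g, communal) in table6.items():
    print(group, round(g_share_of_communal(g, communal), 2))
# 18-19 0.76
# 25-34 0.76
# 45-54 0.82
# 60-75+ 0.65
```

The three younger groups sit at or above three-quarters, while the oldest group falls to about two-thirds.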

Although the above generalization holds for 
the oldest group, it does so to a lesser degree 
and this difference is instructive. While the 
contribution of G to the total variances for 
the younger three groups averages 52.3%, for 
the 60-over 75 group it averages only 42.1%. 
Thus, the general factor is materially less in- 
fluential in producing correlation for older per- 
sons, that is, a differentiation of intellectual 
functioning occurs with advanced age. 

Since the above occurs with no great reduc- 
tion in the communal variance, the comple- 
mentary finding is that the independent con- 
tribution of the primary factors is greater in 
the aged group. This is true for the three major
factors, but most strikingly for the Memory
factor. Whereas the contribution of Factor C
to the total variance for the three
younger groups averages 2.1%, for the 60-over
75 group it is 9.7%. Thus, it appears
that with advanced age, and its attendant dif- 
ferential rates of deterioration, general ability 
plays a lesser role, and Memory a greater role 
in WAIS performance. 
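
The two averages cited for the younger groups follow from the G and C columns of Table 6, as the short check below shows. The dictionary names are illustrative.

```python
from statistics import mean

# Per cent of total variance due to G and to the Factor C specific (Table 6).
g_percent = {"18-19": 52.7, "25-34": 50.0, "45-54": 54.3, "60-75+": 42.1}
c_percent = {"18-19": 2.3, "25-34": 1.7, "45-54": 2.3, "60-75+": 9.7}
younger = ["18-19", "25-34", "45-54"]

g_younger = mean(g_percent[age] for age in younger)
c_younger = mean(c_percent[age] for age in younger)

print(round(g_younger, 1), g_percent["60-75+"])  # 52.3 42.1
print(round(c_younger, 1), c_percent["60-75+"])  # 2.1 9.7
```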


Subtest Measurement Functions 


Implicit in the previous material is a psy- 
chometric rationale for the measurement func- 
tions of the subtests which will be made ex- 
plicit in a later article. One aspect of this is 
worth noting here, namely, the inherent limi- 
tation in the ability of the WAIS subtests 
to measure distinct psychological functions. 
Table 7 presents for the three younger groups
the specificities of the subtests, i.e., the percentage
of the total subtest variance which is
neither shared with the other subtests nor
error, that is, the valid variance measured only
by that subtest. Each was found by subtracting
from the subtest reliability at each age (20) the
associated communality as found in the analysis.
The specificities for the 60-over 75 group
are omitted due to the unavailability of reliability
coefficients. The values for Digit Symbol
are enclosed in parentheses because its reliability
was obtained for a young group of
nursing students not strictly comparable with
the standardization samples.
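
The subtraction described above splits each subtest's total variance into three parts. The sketch below uses invented figures, since the subtest reliabilities and communalities are not reproduced in this article; `variance_parts` is a hypothetical helper.

```python
def variance_parts(reliability, communality):
    """Split unit total variance into common, specific, and error parts."""
    return {
        "common": communality,                   # shared with other subtests
        "specific": reliability - communality,   # reliable but unshared
        "error": 1.0 - reliability,              # unreliable
    }


# Hypothetical subtest with reliability .90 and communality .76.
parts = variance_parts(0.90, 0.76)
print({name: round(value, 2) for name, value in parts.items()})
# {'common': 0.76, 'specific': 0.14, 'error': 0.1}
```

A specificity of .14, as in this invented case, matches the median value the article reports for Table 7.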


Table 7

Subtest Specificities for the WAIS for
Three Age Groups

                    Age
Subtest    18-19   25-34   45-54
Inf.        10      10      09
Comp.       1?      08      08
Arith.      19      22      17
Sim.        13      20      18
D. Sp.      19      24      16
Voc.        10      04      10
D. Sym.    (34)    (43)    (34)
P. C.       11      16      17
B. D.       14      19      18
P. A.       10      02      12
O. A.       00      06      10
As can be seen in Table 7, the values are
quite small with a median of 14%. Thus, on 
the average, only one-seventh of the subtests’ 
variance is not attributable to common fac- 
tors and error. Under these circumstances, the 
attribution of specific measurement functions 
to the subtests as has been done by such cli- 
nicians as Rapaport (15) in connection with 
the Wechsler-Bellevue, is clearly unjustified, 
as has already been noted (5). 

Further, these small specificities account for 
the essentially disappointing results of the re- 
search efforts of the past ten years in pat- 
tern analysis with the Wechsler-Bellevue. The 
specificity of the Wechsler-Bellevue subtests 
is even smaller than those of the WAIS be- 
cause of the lower subtest reliabilities. Since 
subtest score differences are largely reflective 
of relatively small specificity variances and 
relatively large cumulated error, it is not sur- 
prising that these research efforts have con- 
stituted a futile quest for a philosopher’s 
stone. 


Age and Intellectual Organization 


A review of Tables 1 through 5 reveals a 
remarkable degree of similarity among the 
youngest three groups with regard to intellectual
organization. In the case of the major
factors, with only one minor exception,⁶
wherever a factor loading exceeds .20 in any 
one group, it does so in both of the others. 
The evidence seems impressive that the or- 
ganization of intellectual functioning (on the 
WAIS, at least) is essentially invariant be- 
tween the ages of 18 and 54. 

This generalization cannot be extended to
the 60-over 75 group. All three of the major
factors undergo an increase in their independent
variance contributions (Table 6). Although
Factors A and B continue to load the
subtests which they loaded on the younger
groups, Factor A picks up Picture Arrangement,
and Factor B Digit Symbol. The most
striking change occurs in Factor C. The
Memory factor, as already noted, spreads
over several new tests and comes to be independently
responsible for almost 10% of the
total variance, with a concomitant reduction
in the amount of G variance. Thus, a real
change occurs in intellectual organization in
the elderly, with memory playing a far more
important role in determining individual differences
in test performance.


⁶ The exception is the failure of Picture Arrangement
to load Factor B in the 45-54 group. Since the
loading on the other two groups is weak (.24 and
.22), this deviation is probably within sampling error
expectations.

The present analysis casts some doubt on
the accepted conception of intellectual organization
with regard to the lower end of the
developmental scale. Garrett (10) presents a
“differentiation” hypothesis to the effect that,
“Over the elementary school years we find a
functional generality among tests at the symbol
level. Later on this general factor or ‘g’
breaks down into the quasi-independent factors
reported by many investigators” (10, p.
376). The evidence that he adduces for this
view is largely the reduction in correlation 
among tests and factors in college students as 
compared to school children. On its face, this 
evidence is ambiguous since college students 
fall in the upper tail of the intelligence dis- 
tribution, and the patent selection for intelli- 
gence involved would lead to the expectation 
that test correlations would decline. 

In the present investigation, we find that at
college age (18-19) as well as later, for unselected
samples of the population, the general
factor accounts for a substantial amount of 
the total variance of the WAIS subtests, that 
is, about half, which in turn is about three- 
fourths of the communal variance. This does 
not leave much room for greater generality at 
younger ages. Thus, the present findings, al- 
though incomplete, cast serious doubt on the 
notion that the general factor “differentiates” 
into quasi-independent factors by the late 
teens. Whatever “differentiation” occurs does 
so around retirement age, and then “deteriora- 
tion” is probably a better descriptive term. 

As age increases over our samples, mean
education decreases (18), and the question
arises as to whether our results may not be
due to education. This is considered unlikely
on two grounds. Firstly, although education
drops steadily between 25-34 and our oldest
group, there is no drop in G variance between
25-34 and 45-54; in fact, a small increase
occurs, and between 45-54 and 60-over 75,
a large decrease occurs (see Table 6). One
would need to postulate a curvilinear relationship
between education and amount of G variance,
and although this is of course possible,
it would be difficult to rationalize psychologi- 
cally. Secondly, the sharp increase in Memory 
variance which occurs in the oldest group is 
also not interpretable in educational terms, 
and is reasonably accounted for in the light 
of what is known about the effects of gener- 
alized brain damage upon memory function. 
It seems both more parsimonious and more in 
keeping with present knowledge to account for 
the findings on the basis of age rather than 
education. 


Summary and Conclusions 


The WAIS standardization data for four
age groups (18-19, 25-34, 45-54, and 60-over 75)
were separately factor-analyzed using
complete centroid extraction, blind oblique
rotation to simple structure and a positive 
manifold, and a second-order analysis into a 
general factor and orthogonal primary-spe- 
cifics. The following conclusions were drawn: 

1. Three major correlated factors, Verbal 
Comprehension, Perceptual Organization, and 
Memory, are found with striking consistency 
over the age range studied. These are the same 
factors as have previously been reported for 
the Wechsler-Bellevue. In addition, two spe- 
cific minor factors, each loading a single test, 
and hence uninterpretable, were also found. 

2. A strong general factor, accounting for 
approximately half the total variance of the 
subtests, was found to be operating in the 
groups studied. This evidence is contrary to 
Garrett's “differentiation hypothesis,” which 
suggests a sharp reduction in the importance 
of the general factor by the late teens. 

3. Only one exception to factorial invariance
over age occurs. In the 60-over 75 group,
the Memory factor undergoes a sharp increase 
in variance at the cost of the general factor. 
Thus, in aged subjects, intellectual perform- 
ance, even on verbal tests, becomes dependent 
to a noteworthy degree on memory ability. 

4. The relatively small degree of subtest
specificity makes suspect any rationale which 
attributes quasi-unique measurement functions 
to the subtests, and helps account for the dis- 
appointing outcome of ten years of research in 
pattern analysis with the Wechsler-Bellevue. 


Received November 5, 1956. 


Psychological Changes over a Five Year Period
Following Bilateral Prefrontal Lobotomy¹


Isidor W. Scherer, C. James Klett 
VA Hospital, Northampton, Massachusetts 
and John F. Winne 


Washington, D. 


In previous publications, Scherer et al. (1,
2) have reported changes in a standard bat- 
tery of psychological tests administered at 
fixed time intervals over a period of three 
years to a group of lobotomized schizophrenics 
and their equated controls. The present re- 
port extends these observations to the fifth 
year after lobotomy and attempts to arrive at 
some summary statements to describe the out- 
come of five years of lobotomy research. 


Experimental Procedure 
Subjects 


Fifty white male schizophrenic patients at 
the Veterans Administration Hospital, North- 
ampton, Massachusetts have served as sub- 
jects (Ss) during some phase of this five-year 
study. Of these patients, 28 were lobotomized, 
the operation consisting of a standard open 
Lyerly-Poppen approach (2, p. 6) while the 
remaining 22 constituted the control group. 

The Ss for the fifth year of testing consisted 
of 13 lobotomized and 12 nonlobotomized pa- 
tients, fairly well equated for age, education, 
length of hospitalization, diagnostic classifica- 
tion, and degree of cooperativeness. Prior to 
the experiment all patients had received elec- 
troshock treatment, which was unsuccessful or 
only temporarily beneficial, and all had been 
selected as suitable candidates for lobotomy 
by the medical staff of the hospital. 

Although all these Ss were included in the 
previous studies, the groups at the different 
time intervals cannot be considered completely 


1 From the Veterans Administration Hospital,
Northampton, Massachusetts. 
equivalent because of the attrition due to discharge
and other reasons. Presumably the
most intact patients were the ones who were 
ultimately discharged. 


Test Material 


The battery utilized in the fifth year of 
testing consisted of 21 tests that yielded the 
40 “clear-cut measures of functional efficiency”
reported in the three-year study (1).
A description of these tests and the measures 
derived from them can be found elsewhere (2).


Results and Discussion 


A summary of the changes on the 40 measures
of functioning efficiency for five testing
periods appears in Table 1. Included are the 
direction of change within the experimental 
and control groups and the net change in the 
experimental group over the control group.
All recorded changes are statistically signifi- 
cant at the .20 level or better, consistent with 
practice in the earlier studies. 

Examination of the five-year columns of 
Table 1 shows that the experimental group 
has gained on 16 measures and lost on only 
2 when compared with testing prior to lo- 
botomy. During the same period, the control 
group gained on 11 measures and showed a 
loss on 2. The net change column shows that 
at five years the experimental group has made 
significantly more gains over its preoperative 
baseline than the control group on 10 measures,
with significantly less gain on 3. Of the
10 measures showing net gain for the experimental
group, 7 of them (2, 6, 10, 17, 21, 24,
and 53) represent a significant improvement
in the experimental group over their preoperative
level with either a concomitant loss or no
gain in efficiency on the part of the control
group. The other three measures reflect a significant
net improvement on the part of the
operated patients.²


Table 1

Direction of Change from Preoperative Baseline in Forty Tests of Functioning Efficiency
Over a Five-Year Period

[The cell entries of this table (one per measure and testing period under
Experimental Group, Control Group, and Net Change) are not recoverable;
only the row labels and the five-year net-change totals survive.]

Measure

Digit Symbol
  Errors (1)
  Score (2)
Digit Span
  Forward (3)
  Reversed (4)
  Weighted score (5)
Serial Sevens
  Time (6)
  Errors (7)
Hard Pairs (9)
Visual memory rights (10)
Vocabulary (17)
Memory Paragraph
  Immediate rights (18)
  Delay rights (21)
  Average rights (24)
  Immediate distortions (19)
  Delayed distortions (22)
  Average distortions (25)
  Immediate distorted order (20)
  Delayed distorted order (23)
Finger Dexterity Time (29)
Tweezer Dexterity Time (31)
Downey Total Time (36)
Halstead
  Right hand (38)
  Left hand (39)
  Both hands (40)
  Shapes (43)
  Position (44)
Object Sorting
  Score (48)
  Confabulation (49)
  Symbolism (50)
Similarities (51)
Series Completion (53)
Shipley-Hartford (55)
Categorization (57)
Trailmaking
  Time (58)
  Errors (59)
D-A-W Confusion (62)
D-A-M
  Accuracy (65)
  Confusion (66)
Bender Gestalt Elaboration (75)
Word Association Recall (81)

No. of Measures Showing (Net Change, five years)
  Gain         10
  No change    27
  Loss          3

Note. The number in parentheses refers to the measure as defined in the one-year report (2, pp. 6-10). H indicates increased,
L decreased, and a dash (-) no change in functioning efficiency. These figures indicate the 20 per cent level of confidence
or better.

These findings at the fifth year are further 
clarified by an inspection of the changes from 
the third to the fifth year after lobotomy. Al- 
though the earlier papers showed that the ex- 
perimental group had continued to improve 
over its preoperative level and over the con- 
trol group up to the third year, both groups 
have essentially stabilized since that time. At 
five years, the experimental group achieved a 
higher Shipley-Hartford score than it had at 
three years. During the same period, the con- 
trol group committed fewer errors on the 
Serial Sevens test and introduced fewer dis- 
tortions in the immediate recall of the Mem- 
ory Paragraph but demonstrated more con- 
fusion on the Draw-a-Woman test. In terms 
of net change, the experimental group showed 
significantly less D-A-W confusion but signifi- 
cantly more Serial Sevens errors than the con- 
trol group when compared to their respective 
three-year levels. None of the remaining 114 
comparisons reached the .05 level of con- 
fidence. 
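
The figure of 114 can be checked with simple arithmetic. The sketch below assumes that each of the 40 measures contributes three third-to-fifth-year comparisons (change within the experimental group, change within the control group, and net change), which is how Table 1 is organized. The variable names are illustrative, and the chance expectation assumes the comparisons are independent, which the article does not claim.

```python
# Assumption: 3 comparisons per measure (experimental, control, net change).
measures = 40
comparisons_per_measure = 3
total = measures * comparisons_per_measure

# Third-to-fifth-year changes named in the text as significant.
significant = [
    ("experimental", "Shipley-Hartford score"),
    ("control", "Serial Sevens errors"),
    ("control", "Memory Paragraph immediate distortions"),
    ("control", "D-A-W confusion"),
    ("net", "D-A-W confusion"),
    ("net", "Serial Sevens errors"),
]
remaining = total - len(significant)

# Number of comparisons expected to reach the .05 level by chance,
# under the (hypothetical) assumption that all comparisons are independent.
expected_by_chance = total * 5 / 100

print(total, remaining, expected_by_chance)
```

On these assumptions the six significant changes are about what chance alone would produce among 120 comparisons, which fits the statement that both groups have essentially stabilized.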

To summarize briefly the changes over a 
five-year period that are shown in Table 1, 
both the experimental and the control groups 
have shown gains at every testing period over 
their preoperative level. In the interval from 
one to three years, the experimental group 
continued to show greater gain than the con- 
trol group, but after that time the groups 
seem to have stabilized. The experimental 
group, however, maintained its greater net 
gain to the fifth year. Only at the two-week 
testing period, when the postoperative effects 
could be expected to be most acute, did the 


2 Tables giving, for each measure, the number of 
cases, means, the standard error of difference, t, and 
p for the changes within the experimental and con- 
trol groups, and for the net change in the operative 
group during the period from before operation to 
five years after operation have been deposited with 
the American Documentation Institute. Order Docu- 
ment No. 5180, remitting $1.25 for microfilm or 
$1.25 for photocopies. 
experimental group prove to be inferior to the
control group in terms of net gain. These find- 
ings have some implication for those who fear 
the negative after-effects of lobotomy. It ap- 
pears that either the frontal lobes are not sig- 
nificantly related to the mental functions 
studied in this report or that lobotomy does 
not interfere significantly with the function 
of the frontal lobes. 

Assessing the positive effects of lobotomy 
is a somewhat more difficult task. At the end 
of one year it was concluded (2) that: (a) 
there was a tendency towards decreased mental
efficiency, possibly associated with organic
change; (b) ego boundaries were strengthened
at least to the third month when there were
some indications of weakened ego boundaries;
(c) sexual awareness and differentiation were 
increased; and (d) there was an increase in 
the rate of motoric action. Additional changes 
suggested a tendency toward lack of inhibi- 
tion in the moral-social field. At the end of 
three years it was concluded (2) that there 
was evidence of (a) increased mental efficiency;
(b) strengthened ego boundaries;
(c) continued increase in sexual awareness;
(d) increased rate of motoric action; and 
(e) more inhibition or less impulsivity on 
tests of imagination and ideation. 

Considering the changes in the psychologi- 
cal test measures at the fifth year, it might 
be concluded that in a global sense the lo- 
botomized patients demonstrated an increase 
in functioning efficiency. However, attempting 
to account for this overall improvement in 
terms of specific ego functions or in terms of 
specific test measures leads to ambiguity of 
interpretation since there is little consistency 
in the particular measures on which it is mani- 
fested. It is also possible that the individual 
personality structure reacts to surgical trauma 
in such an individualized manner that it over- 
whelms the potential specific effects of the 
lobotomy operation. In this context, it is im- 
portant to note Winne and Scherer’s report 
(3) of a second one-year study. They found 
that of 211 specific predictions derived from 
the first one-year study, only 15 were con- 
firmed in the second study. Furthermore, over- 
all improvement of the experimental group 
was not noted in their study. 

In order to retain the more stable of the 
findings shown in Table 1 and to eliminate
those most susceptible to chance effect, a more 
rigorous level of confidence was imposed upon 
the data than had been used in the earlier ex- 
ploratory studies. Those measures yielding net 
change differences at the .05 and the .01 level
for the five testing periods are identified in 
Table 2. 

It is clear that the operation had some ef- 
fect upon psychological test performance and 
this effect is in a positive direction with the 
exception of the two immediately postopera- 
tive testings. However, it does not seem to 
be fruitful or justifiable to attempt to account 
for the superiority of the experimental group 
in terms of specific ego functions presum- 
ably measured by the test battery, particu- 
larly since the observed test changes could be 
a function of more generalized personality 
changes such as increased docility, coopera- 
tiveness and passivity. 


Table 2

Direction of Net Change from Preoperative Baseline in
Various Tests of Functioning Efficiency
Over a Five-Year Period

                                   Direction of Net Change
Measure                         2 wk   3 mo   1 yr   3 yr   5 yr

Digit Symbol Score (2)            -      -      -      H      H
Digit Span
  Forward (3)                     L      ?      ?      ?      ?
  Weighted score (5)              -      L      -      -      -
Serial Sevens Time (6)            -      -      L      -      H
Hard Pairs (9)                    -      -      H*     -      H
Visual Memory Right (10)          -      -      -      H*     H
Memory Paragraph
  Immediate right (19)            -      -      H      -      -
  Delay right (21)                -      -      H      H      -
  Average right (24)              -      -      H      H      -
  Delayed distortions (22)        -      -      -      H*     -
Finger Dexterity Time (29)        H*     -      -      -      -
Halstead
  Right hand (38)                 L      L      ?      ?      ?
  Both hands (40)                 ?      ?      ?      ?      ?
Categorization (57)               L      -      -      H      -
D-A-M
  Accuracy (65)                   -      H      -      -      -
  Confusion (66)                  -      H      H      H      -

No. of measures showing
  Gain                            1      2      5      7      4
  No change                       ?      ?      ?      ?      ?
  Loss                            4      3      1      0      1

* Significant at .01 level. All other measures indicated are
at the .05 level of significance.
? Entry illegible.





In the three-year report it was suggested 
that in spite of the favorable showing by the 
operated group on the psychological tests, it 
was not doing as well clinically, in some re- 
spects, as the nonoperated group. This con- 
clusion was drawn from qualitative analysis 
of the patients’ behavior and from examina- 
tion of psychiatrists’ progress notes. In order 
to determine whether the increased efficiency 
shown by the lobotomy patients on the psy- 
chological tests at the fifth year was reflected 
in clinical improvement, some additional in- 
formation was collected that was thought to 
serve as a rough indication of clinical change. 
Of the 22 lobotomy patients used in the one- 
year study, 13 are still in the hospital, seven 
have been discharged, one was transferred to 
another hospital, and one died. Of the 22 
matched controls, 18 are still inpatients, and 
four have been discharged. This difference is 
not a significant one.³
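
The article does not say which test lay behind this statement. As a minimal sketch, a two-sided Fisher exact test on the discharge counts can be computed as below. The function name is hypothetical, and the transferred and deceased lobotomy patients are assumed to count as "not discharged."

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x cases in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_observed = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum over every table at least as unlikely as the observed one.
    return sum(prob(x) for x in range(lo, hi + 1)
               if prob(x) <= p_observed * (1 + 1e-9))


# Rows: lobotomy, control. Columns: discharged, not discharged.
# Assumption: transferred and deceased patients count as not discharged.
p_value = fisher_exact_two_sided(7, 15, 4, 18)
print(round(p_value, 2))
```

Under these assumptions the p-value comes out well above .05, in line with the statement that the difference in discharge rate is not significant.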

When a comparison was made between the 
experimental and control subjects that are still 
in the hospital in terms of the number of days 
they have spent out of the hospital on trial 
visits, leave of absence and for other reasons, 
it was found that the control group had actu- 
ally spent a greater mean number of days out- 
side of the hospital although the difference 
was not a significant one. 


3 For those readers interested in disposition data
on the entire series of 104 patients lobotomized at
this hospital between October 23, 1947 and April 26,
1950, 71 are still in this hospital, four died, and four
were transferred to other Veterans Hospitals. Twelve
of the 71 patients still in the hospital were discharged
on at least one occasion but were then readmitted.
Of the 25 remaining patients now out of the hospital,
four were readmitted to this hospital at least
once and it is, of course, unknown how many have
been admitted to some other hospital.


Summary


The present paper completes a program of
lobotomy research based upon periodic psychological
testing over a five-year postoperative
period. Forty measures of functioning efficiency
at the fifth year are presented together
with similar measures obtained at two weeks,
three months, one year, and three years postoperatively.
The results of the latter testing
periods have been discussed in detail in previous
publications. The conclusions are as follows:

1. Although both the control group and the 
experimental group continued to show gains 
on the measures of functioning efficiency up 
to the third year, both groups stabilized be- 
tween the third and the fifth year. There was 
essentially no net change between the third 
and the fifth year. 

2. The lobotomy group was generally superior
to its preoperative level and to the control
group after five years, i.e., it was able to
maintain its gains. 

3. No attempt was made to attribute posi- 
tive gains to specific ego functions because of 
inconsistency of individual measures from one 
testing period to another and because of 
Winne and Scherer’s negative findings in the 
second one-year study. 

4. Indicators of clinical improvement such
as discharge rate do not reflect the positive 
gain shown on psychological tests. 


Received October 9, 1956. 


The Question of Deterioration in Alcoholism


Robert W. Bauer¹ and Derwood E. Johnson
Evansville State Hospital 


Intellectual impairment is presumably a 
consequence of chronic alcoholism. Yet the 
research evidence is equivocal. Most study has 
concerned the depression of function during 
intoxication. Two recent studies which em- 
ployed Form I of the Wechsler-Bellevue In- 
telligence Scale suggested patterns of subtest 
deviation, but were not in agreement. Kaldegg
(1) concluded that his alcoholic group was 
more similar to a psychoneurotic group than 
to any other. Tumarkin (2) found lowered 
digit span and lowered digit symbol perform- 
ance associated with diffuse cerebral atrophy. 

The present report resulted from a com- 
parison of 34 chronic alcoholic patients with 
an equal number of emotionally disturbed pa- 
tients without alcoholism and without evi- 
dence of neurological disturbance. The Wech- 
sler Adult Intelligence Scale, and Raven 
Progressive Matrices (1938) were used. Age 
ranged from 17 to 53 with no significant dif- 
ference between groups. 

Neither the findings of Tumarkin nor those of
Kaldegg were confirmed by statistical analysis of
the WAIS subtest patterns. Overall chi square 
and analysis of variance indicated no signifi- 
cant difference between groups in subtest pat- 
terns. Variance among the subtests for both 
groups taken together was significant (p less
than .001). 

The Wechsler Memory Scale did differentiate
the two groups significantly (p less than
.02). But this overall chi square was largely
due to the superior retention of personal and
current information in the alcoholic group.
Tremor, which was found by Kaldegg in
Bender-Gestalt drawings of 17 out of 18
alcoholic patients, was noted on Wechsler
Memory reproductions in this study. Marked
tremor, indicated by obvious deviations in all
major lines of the drawings, occurred more in
the alcoholic group (chi square, p less than
.02).


1 An extended report of this study may be obtained
without charge from Robert W. Bauer, Ph.D.,
Evansville State Hospital, Evansville, Indiana, or for
a fee from the American Documentation Institute.
Order Document No. 5275, remitting $1.25 for microfilm
or $1.25 for photocopies.


The Raven Progressive Matrices (1938) 
indicated no significant difference between 
groups. 


Most striking was the similarity between 
groups in performance patterns in all three 
tests indicated graphically and statistically. 
No pattern of neurological deficit was inter- 
preted from the results. Kaldegg’s conclusion 
regarding a resemblance between his alcoholic 
and psychoneurotic groups was reinforced by 
the similarity found in this study between 
a group of chronic alcoholic patients and a 
group including various neurotic and psy- 
chotic diagnoses. 


Brief Report. 
Received March 22, 1957. 


A Further Note on Chlorpromazine: Maze Reactions


S. D. Porteus and John E. Barclay 
Territorial Hospital, Kaneohe, T. Hawaii 


The purpose of the present article is to pre- 
sent a progress report on current research re- 
garding the effect of prolonged routine dosage 
of chlorpromazine¹ (300 mg. daily for periods
from 6 weeks to 6 months) on Maze
tested abilities of psychotic patients. Previous 
results were reported in the February issue of 
this Journal (2). 


Method 


The performance of an experimental group 
of 35 cases has been compared with that of 
25 control patients who have never had the 
drug. The original or standard Maze Test was 
applied before medication, the extension or 
practice-free Maze Test after at least 6 weeks’ 
medication. The two forms of the test were 
given to the control group of untreated hos- 
pital patients with a varying interval of time 
between testings. Inspection of the data did 
not reveal any influence of elapsed time on 
the extension score. 

Difficulty in setting up controls. The well- 
known inconsistency of behavior of many psy- 
chotics may conceivably interfere with the 
comparability of two test scores obtained at 
different times. Because the Maze undoubt- 
edly reflects temperamental factors, this in- 
consistency does affect the setting up of con- 
trols. 

Investigators in the first two Columbia- 
Greystone projects attempted to overcome the 
difficulty by using only results “considered by 
the examiner to be representative efforts of a 
cooperative patient” (1, p. 184). Their sub- 
jects were already partially stabilized. We 
could not select patients in this way because 


1 Drugs and placebos used in this study were con- 
tributed by the Smith, Kline, and French Labora- 
tories, Philadelphia. 
of the impossibility of determining what was
a “representative” performance. 

Although anxiety, suspicion, restlessness, or 
aggressiveness could be expected to lower 
Maze performance, especially among the control
group, the experimental group had already
been “tranquilized,” making any decline
in their scores more significant (2).


Comparative Results


The previous report was based on the as- 
sumption that there should be, for psychotic 
patients, the same equivalence between stand- 
ard and extension Maze scores as was evinced 
by normals (3). The use of a control group 
was intended to substantiate this assumption.

In the experimental group (N = 35) the 
deficits in average Maze score after chlor- 
promazine amounted to 1.89 years, whereas 
the difference for the control group (N = 25)
between the standard and extension means
was only -0.1 year, thus proving that the


Table 1

Maze Scores of Experimental and Control Groups

Measure              Experimental    Control

N                         35            25
Standard Maze
  Mean                   11.90         11.96
  SD                      3.42          3.34
Extension Maze
  Mean                   10.01         11.86
  SD                      3.58          3.42
Difference               -1.89         -0.10
t                         2.26*
F                         [illegible]
z a                       2.21*

* Significant at .05 level.
a Mann-Whitney test.
decline in Maze score after chlorpromazine
was not due to the greater difficulty of the 
extension series. The original or standard 
score of the experimental group was 11.9, of 
the control group 11.96. The results of the
comparisons are set forth in Table 1. 

Another approach was to compare the scores 
of 20 pairs of control and experimental cases, 
matched exactly by original standard scores. 
The decline for the chlorpromazine patients 
was even more striking. Table 2 shows that 
they lost 2.2 years as against a gain of 0.2 
year for the controls when the extension series
was applied to each group. Reference to
the tables will show that in the experimental
group the results were significant beyond the
.05 level by every test applied: the critical
ratio of the mean difference, the F and t ratios
obtained by simple analysis of variance, and
the Mann-Whitney (z) test, a nonparametric,
distribution-free test. By any of these tests
the mean differences in the control group 
were totally insignificant. 
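
The article does not give the formula behind its critical ratios. As a sketch under one plausible assumption, dividing the mean difference by a standard error that treats the standard and extension scores as uncorrelated gives a value close to the t tabled for the experimental group in Table 1. The function name is illustrative, and a formula for correlated scores would give a different result.

```python
from math import sqrt


def critical_ratio(mean_a, sd_a, mean_b, sd_b, n):
    """Mean difference divided by its standard error.

    Assumption: the two sets of scores are treated as uncorrelated,
    so the standard error is sqrt((sd_a**2 + sd_b**2) / n).
    """
    se_diff = sqrt((sd_a ** 2 + sd_b ** 2) / n)
    return (mean_a - mean_b) / se_diff


# Table 1 values: standard Maze versus extension Maze, in years.
experimental = critical_ratio(11.90, 3.42, 10.01, 3.58, n=35)
control = critical_ratio(11.96, 3.34, 11.86, 3.42, n=25)

print(round(experimental, 2))  # 2.26, matching the tabled t
print(round(control, 2))
```

The control-group ratio is a small fraction of the experimental one, consistent with the description of the control differences as insignificant.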

A third comparison was possible for 21 
cases to whom the extension series was ap- 
plied a second time, the tests being inverted 
to diminish practice effects. Table 3 shows 
that if the drug is continued for a longer pe- 
riod of administration, the gap between con- 


Table 2

Maze Scores of Matched Experimental and
Control Cases

Measure              Experimental    Control

N                         20            20
Standard Maze
  Mean                   11.85         11.85
  SD                      3.50          3.50
Extension Maze
  Mean                    9.65         12.05
  SD                      3.20          3.60
Difference               -2.20          0.20
t                         2.10*          .20

* Significant at .05 level.


Table 3

Maze Scores: Repeated Applications

Measure              Experimental    Control

N                         21            21
Standard Maze
  Mean                   12.00         12.02
  SD                      3.00          3.24
Extension Maze 1
  Mean                   10.50         11.98
  SD                      3.89          3.29
Difference               -1.50         -0.04
Extension Maze 2 a
  Mean                    9.91         13.09
  SD                      3.61          2.95
Difference               -2.09          1.07
t                         1.92*         1.17

* Significant at .05 level.
a Extension series inverted.


trols and experimental patients widens rather 
than decreases. A fourth application of the 
Mazes to cases still under the drug showed 
scores still well below the original ones, 
whereas the control group was very close to 
the ceiling of the test. 

The results as yet throw no light on the 
crucial question as to whether the effects of 
the drug are transitory or permanent. This 
much may be said. Since in human affairs no 
great gains can be made without cost, it is 
likely that any treatment drastic enough to 
change a mentally disturbed patient into a 
cooperative, unworried individual must be 
otherwise paid for, and it may well be that 
deficits in initiative, planning, and foresight 
are part of the price. What has proved to be 
the case in lobotomy may well be true for the
ataractic drugs. As far as present results go, 
the parallel between chlorpromazine and psy- 
chosurgery seems clear. Notwithstanding Maze 
deficits, patients treated with chlorpromazine 
are obviously less self-concerned, less anxious, 
and less aggressive, and therefore better ad- 
justed at a simple social level. This fact must 
affect claims for the Maze as an index of general
social adaptability. It may well be that
adaptability is of two kinds, one in which 
freedom from inner tensions makes the indi- 
vidual easier to live with, and another in 
which planning and initiative make him more 
industrially capable. The latter may be what 
the test measures. A question to be answered 
by further research is whether the benefits as 
well as any deficits that follow the use of 
ataractic drugs are transitory or permanent. 


Summary 


Repeated administrations of the Maze Test 
to patients receiving chlorpromazine, in com- 
parison with a control group, reveal a con- 
tinued deficit of about two years. The decline 
in maze scores is comparable to that shown
by patients who have undergone lobotomy. 


Measures of Conformity as Found in the Rosenzweig
P-F Study and the Edwards Personal
Preference Schedule¹


George N. Graine²


Alfred University 


Experience with Rosenzweig’s Group Con- 
formity Rating (GCR) of his P-F study 
raised the question whether the same pattern of responses
that Rosenzweig gives as the basis of
his measure of conformity would appear in a 
sample of college students. His interpretation 
of GCR as a measure of social adjustment in 
the direction of conventionality, and inclusion 
of “conformity” in the title, raised the question
whether the GCR and another measure of conformity
would agree. The Edwards Personal
Preference Schedule (EPPS) autonomy scale 
was selected as a measure of the opposite of 
conformity. 

The two instruments were administered to 
a group of 83 college students (40 male and 
43 female); these groups were similar in age 
to the latest standardization groups given for 
both measures. 

The proportion of total responses to each 
P-F cartoon revealed two major discrepancies 
from the GCR criterion items. In situations 
11 and 22, the subjects emphasized scoring 
categories different from or even opposite to 
those given in the manual as the expected cri- 
terion categories. For two other cartoons, the 
proportion selecting the criterion items was so 


1 An extended report of this study may be obtained
without charge from the author’s advisor, Joseph
L. Norton, Alfred University Graduate School,
Box 844, Alfred, New York, or for a fee from the 
American Documentation Institute. Order Document 
No. 5274, remitting $1.75 for microfilm or $2.50 for 
photocopies. 
2 Now in military service. 





low as to call into question their use as
“popular” categories. Three other cartoons not 
given as GCR criteria received sufficiently fre- 
quent responses that they might well be con- 
sidered for inclusion in the GCR. 

The product-moment correlations between 
GCR and “autonomy” for the total group, for 
males and for females, were .24, .12, and .26. 
Such positive correlations do not support the 
contrast between conformity and autonomy. 
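
The report gives no significance tests for these correlations. As a rough illustration only, the sketch below applies the Fisher z transformation with a normal approximation to each reported r and its group size. The function name is hypothetical, and the choice of test is an assumption rather than the author's procedure.

```python
from math import atanh, sqrt
from statistics import NormalDist


def r_p_value(r, n):
    """Two-sided p-value for H0: rho = 0, via the Fisher z approximation."""
    z = atanh(r) * sqrt(n - 3)
    return 2 * (1 - NormalDist().cdf(abs(z)))


# Reported correlations between GCR and EPPS autonomy, with group sizes.
groups = {"total": (0.24, 83), "males": (0.12, 40), "females": (0.26, 43)}
p_values = {name: r_p_value(r, n) for name, (r, n) in groups.items()}

for name, p in p_values.items():
    print(name, round(p, 3))
```

Under this approximation only the total-group correlation falls below the .05 level. None of the three is negative, which is the point the paragraph rests on.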

The norm group of the GCR does not re- 
flect a sex difference, nor did these subjects 
show a significant difference between males 
and females. On the other hand, the EPPS 
does reflect a difference between males and 
females; both the norm group and these sub- 
jects show distinct sex differences in scores. 
This sex difference—no sex difference contrast 
also shows that the two instruments are not 
measuring opposite dimensions. 

This study revealed a failure to obtain, in 
this sample, the same emphasis on some of the 
scoring categories as is expected by Rosen- 
zweig as shown by his criterion items for the 
Group Conformity Rating score. Also, the 
GCR yielded positive correlations with a 
measure of nonconformity, indicating a need 
for further investigation of these instruments. 
A contrast in scores for the sexes was found 
on one instrument but not on the other, fur- 
ther indicating that the two instruments do 
not appear to measure opposite variables. 


Brief Report. 
Received March 18, 1957. 


A Multidimensional Comparison of Therapist Activity
in Analytic and Client-centered Therapy¹


Hans H. Strupp 


The George Washington University, School of Medicine 


“In science we need flexible minds and rigid 
concepts, but in psychoanalysis we have rigid 
minds and flexible concepts” (13, p. 233). 
This statement by a leading analyst, which 
is equally applicable to other forms of psy- 
chotherapy, epitomizes a growing awareness 
among research-minded psychotherapists that 
the fluidity of concepts, the ambiguities of 
language, and the idiosyncratic frames of ref- 
erence espoused by competing schools repre- 
sent serious barriers against furthering our 
knowledge of the psychotherapeutic process. 
From numerous quarters in recent years has 
come the cry for simpler concepts, for opera- 
tional definitions, and for identifying the com- 
mon denominators underlying all psychothera- 
peutic procedures. This trend implies, among 
other things, that differences in theory are 
meaningless if they fail to carry over into 
practice, and that focus upon the actual opera- 
tions may be more fruitful for testing theo- 
retical differences than prolonged controversy 
about the uniqueness of a given system. 

The analysis of therapeutic protocols has 
occupied the time of researchers for some 
years, but rarely has an attempt been made 
to go outside a school of thought and to compare
the techniques of, say, a nondirectivist
with those of an analyst. Yet, such compari-
sons will inevitably play a part in future at-
tempts to evaluate the relative effectiveness
of competing approaches to psychotherapy.


1 This research is part of a larger project which is
supported by a research grant (M-965) from the Na-
tional Institute of Mental Health, of the National In-
stitutes of Health, U. S. Public Health Service. Grate-
ful acknowledgment is made to Winfred Overholser,
M.D., under whose general direction this work was
carried out, and to Leon Yochelson, M.D., project
consultant. In addition, I am greatly indebted to my
former research associate, Rebecca E. Rieger, A.M.,
who contributed materially to the execution of this
study. A slightly different version of this paper was
presented at the 1956 Annual Meeting of the Ameri-
can Psychological Association in Chicago.

This paper presents a preliminary descriptive 
analysis of two varieties of psychotherapeutic 
techniques: insight therapy with reeducative 
goals based on psychoanalytic principles, and 
client-centered therapy. The analysis is medi-
ated by a multidimensional system, designed
to quantify the common denominators in the 
verbal operations of therapists irrespective of 
their theoretical orientation. The data obvi- 
ously do not permit an evaluation of the re- 
spective merits of short-term analytic and 
client-centered therapy. 

The Two Case Histories 

The first case history, published by Wolberg
(12, pp. 688-780),² comprises nine treatment
sessions with a retired business woman,
a widow in the middle years of life, who had
become progressively depressed and retreated
from her customary social contacts. Concerning
his technique, the therapist (Wolberg)
mentions that the work proceeded almost entirely
on a characterologic level, and that the
effect of treatment was mostly of a reeducative
nature, despite the fact that he interpreted
some of the patient’s defenses. A follow-up
indicated that the results of treatment
had been durable.

The second case history is that of Mary
Jane Tilden, counseled by Rogers in a series
of eleven interviews (9, pp. 128-203). Unfortunately,
the author was not aware that
this case is available in its entirety, which
necessitated the selection of reasonably complete
interviews from the beginning, middle,
and terminal phases of treatment from the
published portions.


2 This case, particularly the therapist’s activity, has
been more fully discussed (11).

Miss Tilden was described as a 20-year-old, 
attractive young woman brought to the clinic 
by her mother, who complained that the pa- 
tient was sleeping all the time, brooding, and 
ruminating. Miss Tilden seemed to be with- 
drawing progressively—she had given up her 
job and lost interest in her social life. Miss 
Tilden was treated by nondirective therapy. 
Rogers felt that the eleven counseling hours 
were followed by a period of improved adjust- 
ment; nevertheless, the evaluation of final 
outcome remained somewhat in doubt since, 
shortly after a year had elapsed, there seemed 
to be a recurrence of the earlier symptoma-
tology.³


The System of Analysis 


The system of analysis, whose development
and operational characteristics have been de-
lineated in another publication (10), yields
five measures relative to any therapist com- 
munication. There are two sets of categories 
(Type of Therapeutic Activity and Dynamic 
Focus), and three intensity scales (Degree of 
Inference, Initiative, and Therapeutic Cli- 
mate). These components may be briefly 
characterized as follows: 


Type of Therapeutic Activity. The categories speci-
fied the outer form or structure of a therapeutic in-
tervention and provided a gross analysis of the thera-
pist’s techniques. The major categories were:


00 Facilitating Communication (Minimal activ- 
ity). 
10 Exploratory Operations. 
20 Clarification (Minimal interpretation). 
30 Interpretive Operations. 
40 Structuring. 
50 Direct Guidance. 
60 Activity not clearly relevant to the task of 
therapy. 
70 Unclassifiable. 
Sixteen subcategories served to refine the primary 
rating. 
Degree of Inference. This intensity scale was based
on the conception that inference is an integral part
of all therapeutic communications and that it is al-
ways present to some degree. Each communication
was rated by means of a five-point scale ranging
from low to high inference. Scale points were de-
fined a priori rather than via empirical judgments,
but examples of typical communications were used
to define each scale point.


3 Although one cannot be sure, this case may per-
tain to that period in the evolution of client-centered
therapy in which Rogers (7) detects “vestiges of
subtle directiveness.”

Dynamic Focus. Dynamic Focus referred to the
frame of reference adopted by the therapist at a
particular juncture, and characterized the manner in
which he focuses the therapeutic spotlight. Two ma-
jor sectors were used to differentiate whether the
therapist “goes along” with the patient (A) or 
whether he introduces a different focus (B). Com- 
munications assigned to Sector B were further ana- 
lyzed in terms of five subcategories: 


B-1 Requests for additional information. 

B-2 Focus on dynamic events in the past. 

B-3 Focus on dynamic events in the present. 

B-4T Focus on dynamics of the therapist-patient 
relationship (analysis of the transference). 

B-4 Focus on the therapist-patient interaction 
in terms of the therapist’s role as an ex- 
pert, authority, etc. 


Initiative. The second intensity scale measured the 
extent to which the therapist assumes responsibility 
for guiding the patient’s communications in a given 
channel. Initiative was conceived as ranging from 
low to high, and ratings were made on a four-point 
continuum. As in the case of Degree of Inference, 
scale points were defined by reference to appropriate 
examples. 

Therapeutic Climate. Emotional overtones discer- 
nible in a communication were quantified by means 
of a bipolar scale: 0 = neutral; +1 = mild degree of
warmth; +2 = strong degree of warmth; -1 = mild de-
gree of coldness; -2 = strong degree of coldness. A
“warm” communication is one in which the therapist 
empathizes, shows understanding, or supports; a 
“cold” communication is one in which the therapist 
rejects, withdraws support, or punishes. 
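
As an illustration only, the five measures assigned to each therapist communication might be recorded as in the following minimal Python sketch. The class name `RatedCommunication`, its field names, and the numeric coding of the two intensity scales are assumptions introduced here and are not part of the published system; only the major Type codes, the Dynamic Focus labels, and the Therapeutic Climate values are taken from the description above.

```python
from dataclasses import dataclass

# Major Type of Therapeutic Activity categories as listed in the text.
TYPE_CATEGORIES = {
    0: "Facilitating Communication",
    10: "Exploratory Operations",
    20: "Clarification",
    30: "Interpretive Operations",
    40: "Structuring",
    50: "Direct Guidance",
    60: "Activity not clearly relevant to the task of therapy",
    70: "Unclassifiable",
}

# Dynamic Focus: Sector A plus the five Sector B subcategories.
FOCUS_CATEGORIES = {"A", "B-1", "B-2", "B-3", "B-4T", "B-4"}

# Therapeutic Climate: bipolar scale from -2 (cold) to +2 (warm).
CLIMATE_SCORES = {-2, -1, 0, 1, 2}


@dataclass(frozen=True)
class RatedCommunication:
    """One therapist communication with its five ratings (hypothetical record)."""

    type_code: int     # major category only; subcategories are not modeled
    focus: str
    inference: float   # numeric coding of this scale is an assumption
    initiative: float  # numeric coding of this scale is an assumption
    climate: int = 0

    def __post_init__(self):
        # Reject values that fall outside the categories named in the text.
        if self.type_code not in TYPE_CATEGORIES:
            raise ValueError(f"unknown Type code: {self.type_code}")
        if self.focus not in FOCUS_CATEGORIES:
            raise ValueError(f"unknown Dynamic Focus: {self.focus}")
        if self.climate not in CLIMATE_SCORES:
            raise ValueError(f"climate out of range: {self.climate}")


# A made-up unit: an interpretation focused on present dynamic events.
unit = RatedCommunication(type_code=30, focus="B-3",
                          inference=3.0, initiative=2.0, climate=1)
print(TYPE_CATEGORIES[unit.type_code], unit.focus)
```

Keeping the two sets of categories as lookups and the three scales as plain numbers mirrors the distinction drawn above between categorical and intensity components.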


Procedure 


Seven of the nine Wolberg interviews and 
three representative interviews from the Miss 
Tilden case were scored jointly by two raters 
from the printed scripts. Two of the Wolberg 
interviews were rated independently by the 
same raters to obtain a measure of rater agree- 
ment. 


Results 
Rater Agreement 


Table 1 presents results based on a unit-by-unit
analysis of two interviews scored independently
by two raters. Agreement on a unit (therapist
communication) means that both raters assigned
it to the same category (on Type and Focus,
respectively), or that they gave it an intensity
score (on Degree of Inference or Initiative,
respectively) no more than one-half step apart.
For the last two scales, product-moment
coefficients were computed in addition.


Table 1
Agreement Between Two Independent Raters ᵃ

                         Wolberg             Wolberg
                         Interview VII       Interview IX
System Component         (N = 114)           (N = 154)

Type                     80.7%               80.5%
Degree of Inference      86.0% (r = .86)     94.0% (r = .885)
Dynamic Focus            80.7%               85.7%
Initiative               87.7% (r = .87)     93.5% (r = .93)
Therapeutic Climate ᵇ    --                  --

ᵃ All percentages and correlation coefficients are significant
beyond the .01 level.
ᵇ Nonzero scores too infrequent.
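
The agreement criteria just described can be sketched as follows. This is a minimal illustration in Python, not the procedure actually used in the study; the function names and all sample ratings are hypothetical, and the half-step tolerance is exposed as the assumed parameter `max_gap`.

```python
from math import sqrt


def _check(a, b):
    # Both raters must have scored the same, non-empty set of units.
    if len(a) != len(b) or len(a) == 0:
        raise ValueError("ratings must be non-empty and of equal length")


def category_agreement(rater_a, rater_b):
    """Percentage of units both raters placed in the same category."""
    _check(rater_a, rater_b)
    hits = sum(a == b for a, b in zip(rater_a, rater_b))
    return 100.0 * hits / len(rater_a)


def intensity_agreement(rater_a, rater_b, max_gap=0.5):
    """Percentage of units whose two intensity scores differ by at most max_gap."""
    _check(rater_a, rater_b)
    hits = sum(abs(a - b) <= max_gap for a, b in zip(rater_a, rater_b))
    return 100.0 * hits / len(rater_a)


def product_moment_r(x, y):
    """Pearson product-moment correlation between two sets of scores."""
    _check(x, y)
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant scores")
    return sxy / sqrt(sxx * syy)


# Hypothetical ratings of the same units by two raters (not data from the study).
type_a = [10, 20, 30, 0, 10, 50]
type_b = [10, 20, 20, 0, 10, 50]
inference_a = [1.0, 2.0, 2.5, 3.0, 4.0, 1.5, 2.0, 3.5]
inference_b = [1.0, 2.5, 2.5, 4.0, 4.0, 1.5, 3.0, 3.5]

print(category_agreement(type_a, type_b))
print(intensity_agreement(inference_a, inference_b))  # 75.0
print(product_moment_r(inference_a, inference_b))
```

Reporting the correlation alongside the percentage guards against a tolerance criterion that looks high merely because most scores cluster at one scale point.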


The Wolberg Case 


The therapist’s activity, as mirrored by the 
multidimensional system of analysis, is pre- 
sented in Figures 1, 2, 3, and 4. Within each 
interview, frequencies have been converted 
into percentages. In the case of Degree of
Inference and Initiative, the designation Level 
1, 2, and 3 signifies that scores have been 
grouped; Level 3 refers to the most intense 
scores. Chi squares computed for each com- 
ponent of the system were significant beyond 
the .01 level, indicating that the fluctuations 
in therapist activity for the interview series 
are not attributable to chance. 
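
The two steps described here, converting frequencies to percentages within each interview and testing whether the distributions differ across interviews, can be sketched as below. The code is an illustrative assumption rather than the author's computation: the function names are hypothetical, the counts are invented, and the statistic is the ordinary Pearson chi square for a contingency table whose rows are interviews and whose columns are grouped score levels.

```python
def row_percentages(table):
    """Convert each row of frequencies (one interview) into percentages."""
    return [[100.0 * count / sum(row) for count in row] for row in table]


def chi_square(table):
    """Pearson chi square and degrees of freedom for an r x c frequency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected frequency if the interviews did not differ.
            expected = row_totals[i] * col_totals[j] / grand_total
            if expected == 0:
                raise ValueError("empty row or column; expected frequency is zero")
            statistic += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, df


# Invented counts: three interviews (rows) by Levels 1, 2, 3 (columns).
counts = [
    [60, 30, 10],
    [30, 40, 30],
    [50, 30, 20],
]

for row in row_percentages(counts):
    print([round(p, 1) for p in row])

statistic, df = chi_square(counts)
print(round(statistic, 2), df)
```

The statistic is then referred to a chi-square table at the computed degrees of freedom; the test is run on the raw frequencies, while the percentages serve only for plotting.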

The therapist’s techniques show systematic
variations on all components over the course
of therapy.⁴ The initial interview is devoted
largely to an exploration of the patient’s prob-
lem; the next two interviews reveal an intensi-
fication of therapeutic activity, both in terms
of inferential operations and Initiative; Inter-
views IV and VII emerge as interpretive ones,
the intervening sessions as less “dramatic”;
data for the remaining sessions point to a
phasing out of interpretive activity, but Initia-
tive remains at a relatively high level.

The therapist’s interpretations are geared to
the patient’s current interpersonal relations,
with relatively little emphasis on the therapist-patient
relationship or on genetic antecedents.
Throughout treatment, but especially
in the second half, the therapist stands out as
a person who, in the role of an expert, gives
guidance, states opinions, and engages in procedures
which may be labeled reeducative.
He is clearly more active than passive, both
in terms of frequency of intervention and in
directing the course of therapy. Wolberg’s
own descriptive label “insight therapy with
reeducative goals” appears to be corroborated
by the quantitative analyses.


4 Therapeutic Climate had to be omitted because
there were very few nonzero scores.


Fig. 1. Therapist activity in the Wolberg case in terms of Type of Therapeutic Activity. (Interviews:
I, N = 108; II, N = 79; III, N = 108; IV, N = 174; V, N = 123; VI, N = 85; VII, N = 114; VIII, N = 130;
IX, N = 154. Total number of therapist interventions: N = 1,075.)

Fig. 2. Therapist activity in the Wolberg case in terms of Degree of Inference.

Fig. 3. Therapist activity in the Wolberg case in terms of Dynamic Focus.

Fig. 4. Therapist activity in the Wolberg case in terms of Initiative.

The most noteworthy single result is per- 
haps the phasing of therapeutic activity. It 
seems as if the therapist gradually prepares 
the patient for more inferential formulations 
which he advances in the fourth session. Then 
he waits for a consolidation of insight before 
renewing his interpretive efforts in Interview 
VII. Thereafter, he diminishes his interpretive 
activity while maintaining a degree of thera- 
peutic pressure till the end. 
Fig. 5. Therapist activity in the Miss Tilden case
in terms of Type of Therapeutic Activity. (Inter-
views: I, N = 57; V, N = 23; XI, N = 53. Total
number of therapist interventions: N = 133.)


The Case of Miss Tilden 


The analysis comprises three selected inter- 
views; they are, however, separated in time 
and they presumably represent different stages 
of therapy. 

Reference to Figures 5, 6, 7, and 8 indi- 
cates that the profiles of therapist activity are 
quite similar from interview to interview. As 
might be expected, reflections of feeling ac- 
count for a large percentage of all interven- 
tions (75%); interpretations are virtually 
absent; explorations are used minimally in 
the initial session and are almost nonexistent 
later on; direct guidance is equally rare. The
data on Degree of Inference and Initiative
corroborate these findings: neither maximal
Degree of Inference nor maximal Initiative is
used to any appreciable degree, but the initial
interview is relatively more inferential than
the final one. (In this instance, chi square ex-
ceeded the .01 level of probability; all others
failed to reach the .05 level.) In most of his
interventions, the therapist accepts the pa-
tient’s focus; only very rarely does he assume
the role of an expert or an authority.


Fig. 6. Therapist activity in the Miss Tilden case in
terms of Degree of Inference.


Intertherapist Comparisons 


While the preceding analyses have shown 
that Wolberg’s technique varies systematically 
over the course of treatment whereas Rogers’ 
does not, the question may still be asked, how 
do the two therapists compare at different 
stages of therapy? To explore this problem, 
three interviews from the beginning, middle, 
and terminal phases of the Wolberg series 
were selected and compared with the Miss 
Tilden case. Since the distributions of the
categories within Type and Dynamic Focus
vary so greatly for the two therapists, the
only meaningful comparisons concern the con-
tinua of Degree of Inference and Initiative.
The results of this analysis are presented in
Table 2.


Fig. 7. Therapist activity in the Miss Tilden case in
terms of Dynamic Focus.

Fig. 8. Therapist activity in the Miss Tilden case in
terms of Initiative.

In the case of Degree of Inference, a sig- 
nificant chi square indicates that Wolberg’s 
technique is significantly more inferential than 
Rogers’; with respect to Initiative, Wolberg 
exerts stronger guidance in the middle and 
terminal interviews, but not in the initial one. 
The latter finding is accounted for by the fact 
that Wolberg employs a great many explora- 
tory questions of a diagnostic character in his 
first session, which in terms of Initiative re-
ceive scores similar to the reflection-of-feeling
technique, which Rogers employs throughout. 


Table 2

Chi-Square Comparisons of Therapist Activity in
Initial, Middle, and Terminal Interviews

                       Wolberg I     Wolberg IV    Wolberg IX
                       (N = 108)     (N = 174)     (N = 154)
                       vs.           vs.           vs.
                       Rogers I      Rogers V      Rogers XI
                       (N = 57)      (N = 23)      (N = 53)

Degree of Inference    19.32***      9.39**        4.66*
Initiative             19            22.79***      9.85**

* Significant between the .02 and .05 level.
** Significant at the .01 level.
*** Significant at the .001 level.
Discussion


A multidimensional system of analysis has 
been applied to the therapist’s communica- 
tions in two forms of therapy in an effort to 
measure aspects which may be common to 
both. With respect to the Miss Tilden case, 
the system of analysis yields data which are 
substantially in agreement with other analy- 
ses which have been performed on interviews 
conducted by nondirective counselors. By and 
large, these results also agree with Rogers’ 
recommendations on therapeutic technique. 
Wolberg’s technique, too, is in agreement with 
his descriptive account but, to my knowledge, 
no comparable quantitative studies have been 
published. While not crucial, such evidence 
attests indirectly to the validity of this sys- 
tem of analysis. Of at least equal importance, 
however, is the tentative demonstration that 
the method facilitates the comparative treat- 
ment of therapeutic techniques—a treatment 
which is quantitative and highly objective, 
and which does not prejudge a particular com- 
munication as desirable or undesirable on a 
priori grounds. 

To be sure, the present two case histories 
are comparable only in superficial respects 
and they do not lend themselves to a rigorous 
evaluative comparison. However, they suggest 
a number of questions which appear to be 
basic to all psychotherapy research. Consider 
the following two points. 

We know that both patients entered psy- 
chotherapy seeking alleviation of their emo- 
tional problems. Did their difficulties have any 
common basis? What was the relative degree 
of their disturbance? Even if both had been 
diagnosed as “depressed,” or by any other 
label, we would know but little about the 
common denominators of the underlying dy- 
namics. As Kubie (6) has pointed out, the 
time is ripe for fresh attempts to identify the 
common principles of the “neurotic process.” 
It is clear that studies in which patients are 
matched with experimental “controls” remain 
largely meaningless unless this Herculean re- 
search task can be accomplished. 

Secondly, what transpired in the thera- 
peutic sessions that led both therapists to 
evaluate the outcome as “successful”? Both
therapists are highly experienced men in their 
field; both had a rationale for their respec- 
tive procedures which on the evidence of this 
study differed quantitatively (Degree of In-
ference and Initiative) and perhaps qualita-
tively (Type and Dynamic Focus). Rogers,
in keeping with his theory, consistently re-
flected the patient’s feelings, whereas Wol-
berg, combining analytic principles with re-
educative techniques, attempted to effect
therapeutic changes in his patient mainly by 
means of interpretation and guidance. But 
even if the patients could be equated it would 
not be possible to attribute differences in 
therapeutic outcome (whose measurement is 
another staggering problem) to variations in 
technique as long as relevant factors in the 
therapist’s personality are left out of account. 
Certainly, Wolberg was more “directive” (by 
Rogerian standards). But both therapists con- 
veyed an attitude of respect for their patients 
and implied their right to self-direction; both 
appeared to be warm, accepting, and non- 
critical; both encouraged the patient’s expres- 
sion of feelings; and both, by their thera- 
peutic performance, seemed to engender a 
feeling of greater self-acceptance in their pa-
tients. These attitudes on the part of the
therapist (he may have them in common with
the mature person who can also be a good
parent⁵) are as yet largely unexplored by
objective research, but they may be the touch-
stone of all therapeutic success, regardless of
the theory.⁶ Given the “basic therapist per-
sonality,” it may still be possible that some
therapeutic techniques or combinations of 
techniques catalyze the therapeutic process 
whereas others are relatively inert; contrari- 
wise, no amount of training in technique may 
compensate for deficiencies in the therapist’s 
“basic attitudes.” To approach these prob- 
lems by research is difficult, but by no means 
impossible. 

5 I have in mind Fromm’s “productive character”
(5).

6 There is increasing evidence that the therapist’s
attitude may “cut across” theoretical orientations.
For a comprehensive statement of the client-centered
position, see Rogers’ discussion (8, pp. 19-64). On
the other hand, Wolberg’s transcript offers evidence
that respect for the patient, his capacities, his right
to self-direction, and his worth as a human being
can be conveyed even when the therapist makes in-
terpretations. Fiedler’s studies (2, 3, 4) suggest that
“experts,” irrespective of whether they subscribe to
the analytic, Adlerian, or client-centered viewpoint,
create highly similar “ideal therapeutic relationships”
but, as Bordin (1, pp. 115-116) has pointed out,
Fiedler’s findings cannot be regarded as evidence for
or against the question of the importance to be at-
tached to differences among theories.

It seems that altogether too little attention
has been paid by researchers to the therapist 
and his contribution to the therapeutic proc- 
ess. In keeping with this conviction, I have 
focused upon one facet—the therapist’s tech- 
niques—and attempted to abstract common 
denominators from the therapist’s verbal op- 
erations. The isolation and measurement of 
common denominators in varying therapeutic 
techniques appears to be a needed research 
task which must be expanded by research on 
the therapist’s personality, from which tech-
nique seems to be inseparable.⁷


Summary 


In an effort to compare the therapist’s ac- 
tivity in two forms of psychotherapy, a multi- 
dimensional system for analyzing therapeutic 
communications has been applied to two pub- 
lished case histories: a case treated by short- 
term therapy based upon psychoanalytic prin- 
ciples, and a case treated by client-centered 
therapy. 

The therapist’s activity in analytically ori- 
ented therapy showed statistically significant 
variations over the course of treatment in 
terms of Type of Therapeutic Activity, De- 
gree of Inference, Dynamic Focus, and Initia- 
tive. The data point to an intensification of 
therapeutic activity from the first to the 
fourth interview, at which time a number of 
relatively more inferential interpretations were 
advanced. In the seventh session, a similar 
intensification occurred, followed by a phas- 
ing out to the end of treatment. Interpreta- 
tions dealt principally with the patient’s cur- 
rent interpersonal situation; in addition, the 
therapist’s verbalizations were designed to 
achieve a degree of reeducation in the patient. 

As predictable from theory, the client-cen- 
tered therapist’s activity consisted principally 
of reflections of feeling. This therapeutic tech- 
nique was sustained, with minor fluctuations, 
throughout treatment. 

Intertherapist comparisons showed that the
analytically oriented therapist used techniques
which were generally more inferential and
which showed greater Initiative than those of
his client-centered counterpart. While the ini-
tial interviews did not differ significantly in
terms of Initiative, the approach of the two
therapists was nevertheless divergent on other
dimensions.


7 A three-year investigation, currently nearing com-
pletion, deals with the techniques, therapeutic for-
mulations, and attitudes of more than 200 therapists
who responded as vicarious interviewers to a sound
film of an initial interview.

The primary implications of this prelimi- 
nary comparison relate to the comparative 
study of therapeutic techniques, which is con- 
sidered one of the most important frontiers of 
research in psychotherapy. The isolation and 
measurement of common denominators in the 
techniques of therapists adhering to different 
schools should lead to more definitive studies 
of the therapist’s personality, particularly of 
those attitudes which, wittingly or unwit- 
tingly, he brings to bear upon the therapeutic 
interaction. 


Received November 13, 1956. 
One of the most formidable problems for
researchers in psychotherapy is the use of 
control groups for the evaluation of treatment 
effects. The difference in results obtained from 
a no-treatment control group and an experi- 
mental (treatment) group should provide the 
crucial test of the efficacy of a psychothera- 
peutic method. For both ethical and practical 
reasons, however, the utilization of this para-
digm has in general not proved feasible.² From
an ethical point of view, it is not considered 
proper to deny a sick patient available thera- 
peutic facilities, even for the sake of scientific 
certainty. In this light, it may seem incon- 
sistent that no-treatment controls are appear- 
ing with increasing frequency in pharmacologi- 
cal (as well as electro-shock and lobotomy) 
studies although there remains a relative 
dearth of these controls in studies of psycho- 
therapy. This is true despite the fact that 
both drug and psychotherapy investigations 
may use similar nonhospitalized patient sam- 
ples, with comparable diagnoses and levels of 
illness. There are, however, important differ- 
ences in the conditions under which studies in 
these two areas are conducted. To begin with, 
there is the matter of the length of the control
period. Where a drug is being tested, the
control patient ordinarily is denied therapeu-
tic intervention for only a few weeks, rarely
more than a month. This relatively brief time
interval is considered sufficient for evaluat-
ing the potency of the drug, and hence, the
control period is adequate. Unfortunately, the
efficacy of psychotherapy cannot be assessed
in so short a time. Psychotherapy controls
would have to undergo a no-treatment period
of at least six months, possibly much longer,
if they are to match the average time most
patients spend in treatment. Few psychothera-
pists are willing to expose patients to this
condition, if some treatment form is available.


1 This study is part of a larger research project
supported by Research Grant No. M-532 (C-2) from
the National Institute of Mental Health, United
States Public Health Service.

2 There are exceptions to this generalization, e.g.,
the work of Rogers and his colleagues (6) and Bar-
ron and Leary (1). The great preponderance of psy-
chotherapy studies, however, have no control pro-
visions at all, at least in the sense in which we are
concerned with control design in this paper.

Furthermore, there is the very practical 
problem of retaining control patients in the 
experimental design. Should a large number 
of patients deliberately withdraw or drop out 
prior to completion of the experiment, a 
biased sample may remain which differs in 
important characteristics from that popula- 
tion which was the original focus of investiga- 
tion. The likelihood of a large dropout is much 
greater in psychotherapy than in drug studies, 
primarily because of the differential in con- 
trol time described above. It is a rare case 
where a distressed patient is willing to wait 
six months or longer for professional help, as 
would be the situation for the therapy con- 
trol. Moreover, the motivation and degree of 
illness of the patient who does accept this 
limitation become a matter of conjecture. 
There is still another distinction between 
drug and therapy controls which bears com- 
ment. Theoretically, a control patient should 
have no contact with the clinic or treatment
source during the control period since even
brief contact may be equated with treatment.
This restriction is frequently violated in drug
studies where control patients sometimes re-
ceive placebos and usually maintain clinic con-
tact over the control period, at least for “pre-
scription renewal” purposes. These patients
are given to understand and believe that they 
are being “treated.” Furthermore, there is 
evidence that placebos alone can effect sig- 
nificant changes in the physical and emotional 
status of patients (7). It is a moot question, 
therefore, whether placebo patients may be 
accurately described as receiving “no treat- 
ment.” On the other hand, no-treatment con- 
trols in psychotherapy ordinarily have no 
clinic contacts at all and are fully aware that 
they are not receiving formal treatment. Un- 
der such conditions it is not surprising that 
these controls are harder to collect and retain 
in a study, and that their use is more often 
challenged on ethical grounds. 

Psychotherapy studies are also done, of 
course, on hospitalized patients. With this 
“captive” population, the problem of retain- 
ing patients in the design is less crucial. Yet, 
even in institutional settings psychotherapy 
studies, with or without control provisions, 
have been rather unsatisfactory. Partly this is 
because it is difficult to collect a sufficiently 
large sample of treated patients, let alone con- 
trols, due to the paucity of trained psycho- 
therapists and the long treatment periods re- 
quired. However, this does not explain the 
relative absence of controls for those psycho- 
therapy studies that are done in hospitals. 
The use of controls is a quite common prac- 
tice for evaluating electro-shock and lobotomy 
procedures in the same settings. This discrep- 
ancy may have something to do with the fact 
that, compared to the latter two types of 
treatment, psychotherapy may be considered 
more tried, less radical, and of no great po- 
tential danger to patients. That is, a more 
cautious design specifically including controls 
may be thought prudent where more “con- 
troversial” treatments are being examined. 
Unfortunately, the designs used in typical 
studies of shock and lobotomy are hampered 
by other complications. For example, although 
the controls in these studies are excluded 
from a clearly defined experimental condition,
strictly speaking they are not no-treatment 
controls since usually the patients are ex- 
posed to other treatment influences, e.g., the 
general therapeutic atmosphere of the hos- 
pital plus occupational therapy, recreational 
therapy, or other special activities. 

It seems, then, that although the use of no- 
treatment controls would improve the elegance 
of experimental designs, their application to 
the evaluation of psychiatric treatments has 
encountered serious ethical and practical ob- 
stacles. These obstacles have been particu- 
larly prominent in investigations of psycho- 
therapy where, as a consequence, no-treatment 
controls are used only infrequently. The most 
common type of design in current psycho- 
therapy studies consists of the comparison of 
the pre- and posttherapy status of a single 
treatment group or the comparison of two or 
more groups, differing in techniques or meth- 
ods of treatment. One method or technique 
may be found superior to the others although 
all of them may produce change in patients, 
at least from their initial baseline. The failure 
to include control groups, however, leaves 
crucial questions unanswered. 

Some investigators have attempted to cir- 
cumvent the problem by utilizing modified 
types of control. These fall into two cate-
gories, in terms of the populations used: (a)
“Dropout” patients (or terminators) and (b)
“Wait-list” patients.

1. Dropout patients are those individuals 
who applied for treatment but either deliber- 
ately, or for reasons beyond their control, 
did not keep any treatment appointments, or 
terminated treatment very early, usually with- 
out the approval of the therapist. The flaw in- 
volved in the use of these patients as controls 
is that they represent an obviously self-se- 
lected sample whose motivation for help, on 
a behavioral basis alone, is quite different 
from patients accepting and receiving treat- 
ment. In the dropout sample there are usually 
a few patients who are unable to keep treat- 
ment appointments for “accidental” reasons, 
seemingly unrelated to any otherwise negative 
attitudes toward treatment. Closer examina- 
tion of the reasons offered often reveals that 
these reasons simply have provided convenient 
opportunities for avoiding and rejecting treat-
ment (3). In short, these patients generally
differ from treated patients in one significant 
respect, their unwillingness to accept treat- 
ment. In the case of those few patients whose 
reasons for rejecting treatment appear quite 
plausible (e.g., family suddenly moved to an- 
other state), the unavailability of the patients 
for treatment generally means that re-evalua- 
tions of such patients at later periods are also 
impossible. 

2. Most treatment centers are unable to 
offer therapy immediately to new patients 
and, except for acute cases, routinely require 
wait periods of varying lengths before the be- 
ginning of formal treatment. In using these 
wait-list patients as controls, the investigator 
has the advantage of the availability of indi- 
viduals presumably not different from those 
presently in treatment but who, for reasons 
not determined by themselves, must undergo 
an interval of time without access to formal 
treatment. A study utilizing these patients 
must make certain that the wait-list repre- 
sents an unbiased sample and is reduced sys- 
tematically, i.e., that patients are not assigned 
differentially from the wait-list to a par- 
ticular kind of psychotherapy or a particular 
therapist. It is not uncommon for patients 
with more acute symptoms, or with especially 
“interesting” or unique problems, to receive 
priority in assignment, thus making the wait- 
list a repository of “superficial” or unwanted 
cases.* 

Recently, the rather ingenious device of 
“own-controls” has been adopted to obviate 
the difficult matter of matching control and 
treated groups (4). Here the changes in a pa- 
tient during the wait interval are compared 
with the changes in that same patient during 
treatment. The one serious drawback in this 
approach is that the wait period almost in- 
variably is shorter than the treatment period 
itself. The lack of equivalence of the two 
periods reduces the validity of comparisons. 
Aside from the temporal inequality, the own- 
control design has another shortcoming, which 

*Grummon, for example, placed a client “in the 
own-control group only if it seemed that waiting 
was not likely to cause him serious discomfort or 
harm ... (and) assignment to the wait group was 
occasionally changed to the no-wait group if... 


the client developed anxiety during the waiting pe- 
riod . . .” (4, p. 46). 
is shared by all wait-list approaches: the
largely undetermined influence of the promise 
of treatment on the patients’ status (1). The 
knowledge that treatment is in the offing may 
work to improve or worsen the condition of 
some patients, but to what extent, if at all, is 
a question that only recently has received 
some attention (6). 

Because of the many complicating factors 
found in the efforts to use traditional and even 
modified controls, Watterson (8) has sug- 
gested another possible solution. “When we 
test the efficacy of a given drug, we give 
tablets of the actual drug to the experimental
group and control group. Neither the thera- 
pist ... nor the patients themselves know 
which are the real and which the dummy pills. 
It is possible and logical to think about psy- 
chotherapy in a parallel way; the patient re- 
ceives a unit of supposed treatment, but this 
unit may contain the necessary ingredients or 
it may not. There is nothing fanciful or un- 
usual about such a point of view. We are quite 
used to judging a kind of psychotherapy as 
being likely to succeed or to fail because it 
contains or fails to contain this or that in- 
gredient” (8, p. 239). Watterson adds that a 
test of this approach will require the precise 
statement of hypotheses relative to the signifi- 
cant elements in the treatment and consequent 
changes in the patient. In this way, testable 
predictions may be made concerning patient 
change as a function of some specific element 
or technique. Experimental and control groups 
would be used, but neither patient nor thera- 
pist would be aware of what hypotheses were 
being tested. The study reported herein serves
as an illustration of this alternative design 
suggested by Watterson. 

Previous unpublished data accumulated by 
the authors in a series of pilot studies indi- 
cated that the amount of contact between pa- 
tient and psychotherapist is an important fac- 
tor influencing improvement rate. Because this 
factor appeared so consistent over several cri- 
teria, and seemed to account for so much of 
the variance between improved and unim- 
proved patients, it was felt that it should be 
controlled in the next sequence of studies. In 
the present experiment, therefore, a group of 
patients was formed in which patients were 
to have minimal contact with their therapists, 
in contrast to the more traditional techniques.
The limitation was imposed in terms of both 
number of appointments over a period of time 
and the amount of time which each of these 
separate appointments consumed. The specific 
hypothesis to be tested was that patients who 
have fewer and briefer sessions of psycho- 
therapy will show significantly less improve- 
ment in the effectiveness of their social rela- 
tionships than patients with measurably more 
and longer psychotherapeutic sessions, over 
approximately the same period of time. 


Procedure 


A total of 54 psychiatric patients * who ap- 
peared at the Henry Phipps Psychiatric Clinic 
Outpatient Department between June, 1953 
and October, 1954 were included in this study. 
Most patients were psychoneurotic, diagnosed 
mainly as anxiety and depressive reactions, 
and the remainder had personality disorders. 
Patients with diagnoses of organic brain dis- 
ease, antisocial character disorder, alcoholism, 
overt psychosis, or mental deficiency were ex- 
cluded. The patients were assigned to one of 
three different forms of psychotherapy: 


a. Analytically oriented group therapy, in which
patients were seen in groups of five to eight once a 
week for about one and one-half hours; 

b. Analytically oriented individual therapy in which 
patients were seen at least one hour a week; 

c. Minimal contact therapy in the Continued 
Treatment Clinic (CTC) of the Phipps Outpatient 
Department where patients were seen individually 
no more than half an hour once every two weeks. 


Three psychiatrists participated in the study. 
They were in the second year of their resi- 
dency, and had approximately equivalent ex- 
perience in both group and individual treat- 


*The total sample actually was 91 patients, but 
those patients (N = 28) who had three or less psychotherapy
sessions were excluded from the present
study. The design required a sample of 54 patients 
who had at least 4 treatment sessions. To make al- 
lowance for anticipated “dropouts” (arbitrarily set at 
3 or less sessions), extra patients were assigned ini- 
tially and, consequently, 63 patients were treated in 
the program. Nine patients were omitted from the 
present analysis to balance the design and simplify 
statistical computations. The major condition for 
omission was poor attendance at therapeutic sessions. 
Criterion test scores were not appreciably altered in 
any of the cells or groupings affected by removal of 
the 9 patients. 





ment. Each psychiatrist treated a roster of 
eighteen patients, six in each of the three 
forms of treatment. Assignment of patients to 
form of treatment and to therapist was made 
on a random basis by the research staff. 
Neither patient nor psychiatrist, therefore, 
could influence either choice of treatment or 
choice of therapist (or patient). Patients in 
the three treatment forms showed no signifi- 
cant differences in age, sex, diagnosis, social 
class, length of illness, or marital status. 
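The sample accounting given in the footnote and the balanced layout of the design (three psychiatrists by three forms of treatment, six patients per cell) can be retraced with a few lines of arithmetic. The sketch below is illustrative only: the function `balanced_assignment`, the therapist labels, and the seed are hypothetical, and the shuffle-and-deal procedure is an assumption, since the paper states only that assignment was random and made by the research staff.

```python
import random
from collections import Counter

# Sample accounting as reported in the footnote.
total_sample = 91
early_dropouts = 28        # three or fewer psychotherapy sessions
omitted_for_balance = 9    # removed to balance the design

treated = total_sample - early_dropouts
analyzed = treated - omitted_for_balance


def balanced_assignment(patient_ids, therapists, treatments, seed=None):
    """Shuffle patients, then deal them evenly into therapist x treatment cells.

    Hypothetical helper: one simple way to obtain a balanced random layout,
    not necessarily the procedure used in the study.
    """
    cells = [(th, tr) for th in therapists for tr in treatments]
    ids = list(patient_ids)
    if len(ids) % len(cells) != 0:
        raise ValueError("patients must divide evenly among the cells")
    rng = random.Random(seed)
    rng.shuffle(ids)
    return {pid: cells[i % len(cells)] for i, pid in enumerate(ids)}


therapists = ["A", "B", "C"]                 # placeholder labels
treatments = ["group", "individual", "CTC"]
assignment = balanced_assignment(range(1, analyzed + 1), therapists, treatments, seed=0)
cell_sizes = Counter(assignment.values())

print(treated, analyzed, sorted(cell_sizes.values()))
```

Each of the nine cells receives six patients, so every psychiatrist carries a roster of eighteen, as described in the text.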

Prior to treatment assignment, each patient 
was given a structured interview by the re- 
search psychiatrist, and the interview was 
observed through a one-way screen by the re- 
search psychologist. The focus of the inter- 
view was the patient’s day-to-day relation- 
ships with each significant individual in his 
life (e.g., spouse, siblings, children, parents, 
boss, co-workers, male and female peers, etc.). 
The extent of the patient’s ineffective behav- 
ior with each individual during the four-week 
period immediately preceding the interview 
was rated on a six-point scale in each of 15 
categories: overly-independent, superficially 
sociable, extrapunitive, officious, impulsive, 
hyper-reactive, overly-systematic, overly-de- 
pendent, withdrawn, intrapunitive, irrespon- 
sible, overcautious, constrained, unsystematic, 
and sexual adjustment. The interviewing psy- 
chiatrist and the psychologist-observer made 
separate ratings on the patient, and then in 
conference, after comparing differences in ratings,⁵
completed a series of joint ratings based
on the consensus of their conference discus- 
sion. A similar interview was also conducted 
with a relative (or close friend) of each pa- 
tient by the research social worker and a sec- 
ond psychologist-observer, the interview simi- 
larly focusing on the patient’s ineffective 
social behavior. The independent ratings of 
interviewer and observer were also confer- 
enced, resulting in a set of joint ratings. Fi- 
nally, the two interviewing teams met to pool 
their accumulated information and make a 
single final series of ratings in each of the 15 
categories. The ratings for all categories were 
summed, resulting in a total Social Ineffective- 
ness score. 


5 Reliability studies indicate that interrater corre- 
lation over the 15 categories of the Ineffectiveness 
scale is approximately .69. 
The experimental design required re-evaluation
of all patients six months after treatment
commenced, again after twelve and twenty- 
four months, and periodically thereafter. Data 
from the first (six-month) re-evaluation is re- 
ported in this paper. 


Table 1

Mean Number of Months and Sessions of Psychotherapy
for Group, Individual, and CTC Patients (N = 54)

Treatment       Months    Sessions
Group            5.0       15.8
Individual       5.1       17.7
CTC              5.5        9.3


Results

Table 1 indicates that up to the time of the
first re-evaluation, all three categories of patients
had been in treatment over approximately
the same period of time (from 5.0 to
5.5 months),⁶ but there was a difference between
them in terms of average number of
therapeutic sessions attended over this period.
The number of sessions for group patients
was 15.8, for individual patients 17.7,
and for CTC patients only 9.3. On the average
per month, then, group patients had 3.2
sessions, individual patients 3.5 sessions, and
CTC patients 1.7 sessions. In short, over the
same period of time patients in individual
treatment and those in group treatment had
approximately twice as many therapeutic contacts
as patients assigned to the Continued
Treatment Clinic.

Patient change (or “improvement”) was
measured by the algebraic difference between
the initial (pretherapy) Ineffectiveness scores
and the first re-evaluation Ineffectiveness
scores. These difference or change scores were
analyzed to determine the relative efficacy of
the three forms of therapy, the influence of
the three psychiatrists, and the importance of
the interaction between therapy and psychiatrist.
As a statistical control for certain differences
between types of treatment due to
differential dropout rates, and for a small correlation
between initial and change scores, the
analysis of covariance was applied. The results
summarized in Table 2 indicate a highly
significant difference between types of treatment.
Group patients improved more than
CTC patients (p < .01), and individually
treated patients also improved more than
those treated in CTC (p < .05). No difference
was found between group and individually
treated patients. There was, however,
some slight evidence of a difference in the influence
of the psychiatrists (p < .20).


6 A few patients in each type of treatment terminated
before the 6-month evaluation, reducing the
mean number of months of treatment in each category
to less than six.


Table 2

Analysis of Covariance of the Ineffectiveness Scale Scores for Treated Patients (N = 54)

Source                      df      ΣX²       ΣXY       ΣY²    Adj. ΣY²   df   Adj. MS     F
Therapist                    2    100.48     42.83    169.00    151.65     2    75.83    2.19**
Therapy                      2    615.82    −14.12    294.70    354.79     2   177.40    5.11*
Interaction                  4     39.63    −13.10     92.60    105.81     4    26.45
Within                      45   2309.33    785.83   1827.00   1559.59    44    35.44
Total                       53   3065.26    801.44   2382.30
Pooled Error                49   2348.96    772.73   1919.60   1665.40    48    34.70
Therapist + Pooled Error    51   2449.44    815.56   2088.60   1817.05    50
Therapy + Pooled Error      51   2964.78    758.61   2214.30   2020.19    50

* p < .01.
** p < .20.

Note. X and Y refer to Initial and Difference scores, respectively.


Discussion

The hypothesis that fewer and briefer sessions
of psychotherapy reduce improvement
has been confirmed within the framework of
the paradigm suggested by Watterson. An as- 
sumed “necessary ingredient” of treatment 
was diluted or restricted for certain patients 
whose improvement rate was thereby ad- 
versely affected. A step by step verification of 
other supposed necessary ingredients of psy- 
chotherapy might well proceed in analogous 
fashion. 
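The adjusted entries of Table 2 can be retraced from the raw sums of squares and cross-products. The following sketch assumes the usual single-covariate adjustment, in which the sum of squares of Y is reduced by (ΣXY)² / ΣX², and obtains each effect by subtracting the adjusted pooled error from the adjusted "effect plus pooled error" line. The variable and function names are illustrative and are not taken from the paper.

```python
# Raw sums from Table 2: (sum X^2, sum XY, sum Y^2),
# where X = initial score and Y = difference (change) score.
table = {
    "within":               (2309.33, 785.83, 1827.00),
    "pooled_error":         (2348.96, 772.73, 1919.60),
    "therapist_plus_error": (2449.44, 815.56, 2088.60),
    "therapy_plus_error":   (2964.78, 758.61, 2214.30),
}


def adjusted_ss(sxx, sxy, syy):
    """Sum of squares of Y remaining after linear regression on X."""
    return syy - sxy ** 2 / sxx


adj = {name: adjusted_ss(*sums) for name, sums in table.items()}

error_df = 48  # 49 pooled-error df, less 1 df for the covariate
error_ms = adj["pooled_error"] / error_df


def f_ratio(effect_plus_error, effect_df=2):
    """F for an effect, from the adjusted 'effect + error' and error lines."""
    effect_ss = adj[effect_plus_error] - adj["pooled_error"]
    return (effect_ss / effect_df) / error_ms


f_therapy = f_ratio("therapy_plus_error")
f_therapist = f_ratio("therapist_plus_error")

print(round(adj["pooled_error"], 2), round(error_ms, 2))
print(round(f_therapy, 2), round(f_therapist, 2))
```

Under these assumptions the computed values agree, to rounding, with the tabled adjusted pooled error (1665.40), its mean square (34.70), and the F ratios for therapy (5.11) and therapist (2.19).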

Specifically, the present study indicates that 
reduction of interpersonal ineffectiveness is 
more dependent on the number and extent of 
contacts between therapist and patient than 
on the particular psychotherapeutic technique 
used. CTC patients had measurably less contact
with their therapists than did other patients,
and they derived significantly less benefit
in terms of the criterion employed.⁷ The
difference in treatment results between the 
three psychiatrists is only suggestive rather 
than conclusive. Since the general background, 
training, and experience of all therapists were 
similar, personality differences between thera- 
pists would seem to warrant further study. 

The exact significance of the ingredient of 
treatment identified in this study is not en- 
tirely clear. Perhaps the most parsimonious 
explanation is that there is an optimum “dos- 
age” of psychotherapy, in terms of amount of 
patient-therapist contacts, but the minimal 
amount or dosage received by CTC patients 
was inadequate, at least by comparison with 
the more conventional amounts received by 
the other patients in the study. There are, 
however, other possible interpretations of our 
results. Recently, it has been pointed out that 
the “placebo effect” observed in pharmaco- 
logical studies may have direct implications 
for psychotherapy (7). In this context, the 
placebo effect was defined as those changes in 
the patient produced primarily “by the patient’s


7 Other criteria utilized in the larger study, of
which this is a part, did not reflect similar changes.
Improvement, however, is by no means a simple,
unitary phenomenon and it may vary depending on
the criterion of change (5). Social ineffectiveness was
selected for exposition in the present paper because
the scale measuring that factor is much the same as
the original one used in the pilot studies where the
tentative findings concerning amount of contact and
improvement were made. The other criteria have undergone
considerably greater revision from their original
forms.

faith in the efficacy of the therapist
and his technique” (7, p. 301). It was sug- 
gested that the adequate evaluation of any 
treatment form requires that it be compared 
with another treatment form in which pa- 
tients have equal faith. Admittedly, the pres- 
ent study meets this requirement only indi- 
rectly, if at all. The conviction with which the 
patients accepted the different types of treat- 
ment and the therapists was not investigated. 
Yet there appears to be no compelling reason 
why patients should have the least faith in 
the CTC treatment. To most patients who 
come to a public clinic, this kind of brief in- 
termittent contact with a physician is similar 
to their other medical experiences, and pre- 
sumably it was what they expected and sought 
when they requested psychiatric help. At least 
initially, therefore, there seems little reason to 
consider the CTC patients as not having con- 
fidence in the treatment form to which they 
were assigned. On the other hand, the thera- 
pists themselves may have had unequal faith 
in the efficacy of the three treatments, espe- 
cially viewing the limited contact form as in- 
ferior, and conceivably could convey this atti- 
tude in quite subtle ways to the CTC patients. 
However, if we can assume that rate of drop- 
out (or early termination) is an inverse meas- 
ure of faith in treatment, the CTC patients 
cannot be said to have possessed less of this 
attribute. Other studies by the authors have 
shown that the dropout rate for the patients 
in group therapy was appreciably higher than 
for patients in CTC (2). 

Finally, it might be suggested that number 
and extent of therapeutic contacts are them- 
selves crucial for instilling faith in a therapist 
and his techniques. That is, a patient’s con- 
viction that he will be helped is not neces- 
sarily something that he brings to treatment 
initially, but is a kind of confidence that is 
built up through fairly intensive and frequent 
therapeutic contacts. Hence, the inferiority of 
the CTC treatment might simply reflect the 
lesser amount of faith of patients assigned
thereto, as a consequence of their limited 
therapeutic contacts. If this interpretation is 
correct, the experiment may be said to con- 
firm the validity of the concept of placebo ef- 
fect as it operates in psychotherapy. 
Summary


The unavailability of nontreatment control 
groups to test the efficacy of psychotherapy 
and dissatisfaction with the use of dropout 
and wait-list groups as substitute procedures
have prompted the use of an alternate experimental
design. This design requires the precise
statement of hypotheses relative to the
presence or absence of an assumed significant 
element of treatment and the consequent 
changes in patients. Adopting this scheme, the 
present study specified that patients having 
fewer and briefer sessions of psychotherapy 
will show significantly less improvement than 
patients with more and longer sessions, over 
the same period of time. Fifty-four psychiatric 
patients were assigned at random to three psy- 
chiatrists, each of whom treated an equal 
number of patients in group therapy and two 
different forms of individual therapy. In one 
of these latter forms, the patients were able to 
have only one-half as many psychotherapy 
sessions and the sessions lasted only one-half 
as long as those of patients treated in the other two
forms. Over a six-month experimental period 
the patients with restricted therapeutic con- 
tacts showed less improvement on the criterion 
of change used. The significance of amount of
therapeutic contacts is discussed. 
A Comparison of “Remainers” and “Defectors”
Among Child Clinic Patients¹


Eugene E. Levitt 


Illinois Institute for Juvenile Research 


Studies by Rubinstein and Lorr (3) and 
Hiler (2) have indicated that certain back- 
ground data and test results distinguish adult 
patients who break off therapy prematurely 
from those who continue. The present study 
is an investigation of this phenomenon with 
child patients. The “defector” sample was a 
group of 208 cases who were accepted for 
therapy at a child guidance clinic, but who 
failed to appear for any treatment interviews. 
The “remainers” were 132 cases in which 
either the mother or the child had had at 
least 20 treatment interviews. Data were 
available for 61 variables. Fifteen of these 
concerned objective background description, 
like race, sex, mental age, religion, etc. Five 
dealt with the child’s health and toilet-train- 
ing history, 10 concerned symptoms and prob- 
lems, parental descriptions made up 9 vari- 
ables, 15 more dealt with parental handling 
of the child, 4 variables were obtained from 
projective test protocols, including severity of 
the disturbance and prognosis, and there were 
3 miscellaneous variables. With the exception 
of mental age, grade placement, and age at 
first examination, all variables were discrete. 
Analyses of differences between groups were 


1An extended report of this study may be ob- 
tained without charge from Eugene E. Levitt, Insti- 
tute for Juvenile Research, Chicago, Illinois, or for 
a fee from the American Documentation Institute. 
Order Document No. 5182, remitting $1.25 for mi- 
crofilm or $1.25 for photocopies. 


therefore accomplished by chi squares. The 3 
continuous variables were analyzed by t tests.
Thirty-four, or 56%, of the analyses had p
values of .50 or higher, and 26, or 43%, had 
values of .70 or above. Only 8 reached beyond 
the .20 level, 5 of these attaining significance 
at the .05 level or beyond. The 5 variables do 
not seem to hang together in a logical fashion, 
nor does there appear to be any theoretical 
reason to expect them to be differentiating. 
In general, chance significance is suggested. 
Using the Brozek-Tiede approximation (1), 
the probability of finding 5 analyses of 61 
significant at the .05 level by chance is very 
nearly .25. The hypothesis of chance signifi- 
cance therefore seems tenable, and we may 
reasonably conclude that the remainer and 
defector child patients do not differ with re- 
spect to the 61 variables analyzed herein. 
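The reasoning about chance significance can be illustrated with an exact binomial tail. This is a sketch under a strong assumption, namely that the 61 analyses are independent and that every null hypothesis is true; it is a different calculation from the Brozek-Tiede approximation cited above and need not reproduce the quoted figure. The function name `binomial_tail` is illustrative.

```python
from math import comb


def binomial_tail(n, k, p):
    """P(at least k successes in n independent trials, each with probability p)."""
    return sum(comb(n, i) * p ** i * (1 - p) ** (n - i) for i in range(k, n + 1))


n_tests = 61          # analyses performed
n_significant = 5     # analyses reaching the .05 level
alpha = 0.05

expected_by_chance = n_tests * alpha
p_at_least_5 = binomial_tail(n_tests, n_significant, alpha)

print(f"expected significant results by chance: {expected_by_chance:.2f}")
print(f"P(at least {n_significant} of {n_tests} significant by chance): {p_at_least_5:.3f}")
```

About three significant results are expected by chance alone, and five or more is not a rare outcome under these assumptions, which is consistent in direction with the conclusion drawn in the text.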


Brief Report 
Received January 15, 1957. 
The Effectiveness of Group Psychotherapy with Chronic
Schizophrenic Patients and an Evaluation of 


Different Therapeutic Methods¹


Ralph G. Semon 
VA Mental Hygiene Clinic, Lowell, Mass. 


and Norman Goldstein²
Psychological Counseling Center, Brandeis University 


Since 1921, when the use of group psycho- 
therapy with the various types of psychoses 
was first reported (6), treatment by this 
method has shown a steady increase (4). De- 
spite general agreement about the value of 
group therapy in the treatment of chronic 
schizophrenic patients, however, research in 
this area has not kept pace with the clinical 
use of the technique. There have been only 
two objective studies, and of these only Sacks 
and Berger (11) published definitive results 
with statistical controls. They reported posi- 
tive changes in the intrahospital behavior of 
chronic schizophrenic patients after one year 


1 This study was conducted at the Boston State 
Hospital on whose staff the authors were employed 
at the time of the investigation. It was supported in 
part by a Research Fellowship from the National 
Institute of Mental Health, U. S. Public Health 
Service. 

This study is part of a doctoral dissertation com- 
pleted at Boston University in 1954, under the gen- 
eral direction of Drs. Chester C. Bennett, John Ar- 
senian, Austin W. Berkeley, and Nathan Maccoby, 
to whom the authors are indebted for their encour- 
agement and guidance. 

The authors wish to express their appreciation to 
Walter E. Barton, M.D., Superintendent, and to the 
staff of the Boston State Hospital, for their coopera- 
tion and assistance in this research. 

Special acknowledgment goes to the late Robert S. 
Johnson, M.D., for his assistance in the selection of 
the patients and for the facilitation of the work of 
the authors on the male chronic service of the Boston 
State Hospital where he was Senior Psychiatrist at 
the time of the study. 

2Now with the Psychiatry Service, Beth Israel 
Hospital, Boston. 
of group therapy. Powdermaker and Frank
(10) evaluated the effectiveness of group ther- 
apy with a similar group of psychotic patients. 
While their findings were not statistically sig- 
nificant, the authors were enthusiastic about 
its potential value. 

Another problem in need of further evalua- 
tion is the relative efficacy of different meth- 
ods of group treatment. Slavson (14), and 
Meiers (9), in their comprehensive analyses 
of current trends in group therapy, discuss 
the development of two major orientations. 
These have been described as “therapy in a 
group” and “therapy through a group” (1, p. 
343). Bovard (2) uses the analogous terms, 
“leader-centered” and “group-centered,” to 
distinguish methods used with nontherapy 
groups. The leader-centered method, or ther- 
apy in a group, assumes that the therapeutic 
potential is resident in the relationship formed
between each member and the leader with the 
result that the focus of treatment is on the 
individual within the group. In the group- 
centered method, or therapy through a group, 
the assumption is that the motivation for 
change is contained within the emotional re- 
lationship established among the members of 
the group. In this method, the focus of treat- 
ment is on the group. 

There have been no controlled investiga- 
tions of the relative merits of these two ori- 
entations to group psychotherapy with chronic 
schizophrenic patients, although qualitative 
observations on their use have been reported 
Table 1

Description of Groups on Basis of Matching Variables

                       HASᵃ Total      HAS Subscale Iᵇ   Total
                       Score           Score             Hospitalization
Groups                 (pre-therapy)   (pre-therapy)     (in years)         Age
Experimental
  I      Mean          44.7            41.7              13                 38
         Range         12.2-84.6       12.5-88.2         8-22               29-45
  II     Mean          45.9            42.5              13                 37
         Range         28.0-72.4       16.7-65.2         6-24               33-48
  III    Mean          45.1            48.1              13                 38
         Range         16.3-83.1       17.4-73.3         5-19               29-45
  IV     Mean          44.6            42.9              14                 37
         Range         14.9-89.2       13.0-91.7         5-23               26-45
Control
  V      Mean          48.3            50.2              14                 38
         Range         14.3-81.6                         3-21               25-45

ᵃ Hospital Adjustment Scale.
ᵇ Communication and Interpersonal Relations.


(13). Frank also discussed two different phi- 
losophies of treatment in group psychotherapy 
with such patients. After describing the work 
of exponents of these two methods, Frank 
concluded: “Our experiences allow no deci- 
sion as to whether the method of working pri- 
marily with the group, or working primarily 
with the individual patients, is better” (3, p. 
229). 

In view of the increasing use of group psy- 
chotherapy in the treatment of chronic schizo- 
phrenic patients, and the limited number of 
controlled studies in this area, it seemed pro- 
ductive to investigate the following problems: 
(a) the value of group psychotherapy for 
hospitalized chronic schizophrenic patients 
and (b) the relative effectiveness of different
methods of treatment. The hypotheses were 
formulated as follows: 

1. Groups that receive psychotherapy will 
show clinical improvement; a group that re- 
ceives no psychotherapy will show no clinical 
improvement. 

2. There will be no significant difference in 
the relative therapeutic effectiveness of two 
methods of treatment, the one leader-centered, 
and the other group-centered. This is stated 
as a null hypothesis since there is not suffi- 





11.5-84.6 


cient evidence to indicate that one approach 
will be better than the other. 


Method 


Patient Population 


The experimental design called for five 
matched groups of chronic schizophrenic pa- 
tients, four experimental and one control. 
The N for each of the experimental groups 
was eight; the N for the control group was
seven. The patients who participated in this 
study were from the male chronic service of 
the Boston State Hospital. To maximize ho- 
mogeneity of the groups, the patients were 
selected on the basis of the following criteria: 
(a) they must all be from the diagnostic category
of schizophrenia; (b) they must have
been hospitalized at least two years without 
remission; (c) they must be within the age 
range of 20 to 50 years; (d) they must neither 
be mental defectives nor have any gross or- 
ganic pathology; and (e) they must not be in 
any other form of therapy or be subjects in 
any other research at the time of the study. 

The patients were then assigned to one of 
five groups. The groups were matched on the 
following variables: (a) over-all adjustment 
in the hospital; (b) interpersonal functioning;
(c) length of total hospitalization; and
(d) age. The measure of over-all hospital 
adjustment was the total score (pretherapy) 
on the Palo Alto Hospital Adjustment Scale 
(HAS), while interpersonal functioning was 
measured by the score (pretherapy) on sub- 
scale I of the HAS. This scale will be more 
fully discussed below. 

The matching on the variables of age and 
length of hospitalization was tested by non- 
parametric procedures for equal and unequal 
groups (7, 15). The t test for the significance
of the difference between uncorrelated means 
was applied to test the matching on the HAS 
total score and the subscale I score. No sig- 
nificant differences were found. Table 1 de- 
scribes the groups with respect to the match- 
ing variables. 


Groups 


The five groups were randomly designated 
as the control and experimental groups. Each 
of the four experimental groups had 50 hours 
of group therapy, meeting for daily sessions 
of one hour, five days a week for ten weeks. 
The mean attendance throughout the treat- 
ment period was 7.6. With the exception of 
two meetings the number of patients present 
was at least six. The control patients did not 
meet as a group, but were evaluated before 
and after the treatment period along with the 
experimental patients. All the patients con- 
tinued to receive the standard custodial treat- 
ment on the wards. 

The two therapeutic techniques used in 
this study were designated Active-Participant 
(AP) and Active-Interpretive (AI). The for- 
mer is comparable to the group-centered 
method, while the latter is comparable to the 
leader-centered method. The term active is ap- 
plied to both methods, since activity on the 
part of the therapist is seen as a necessary 
characteristic of work with psychotic patients. 
The methods were characterized by prescribed 
differences in the role of the leader. Of the 
four experimental groups, two were randomly 
selected as AP groups and two as AI. The authors
alternated as leader and observer so that
each was the therapist in two groups, assum- 
ing the AP role with one group and the AI 
role with the other throughout the treatment
period. 


Leader Role 


The two leader roles were defined as fol- 
lows: 

a. In the Active-Participant role, the aim
of the leader was to promote interaction. To 
this end he functioned as a quasi-member of 
the group. His behavior was directed primarily 
toward the stimulation of group activity and 
the encouragement of participation on the part 
of each member. With this purpose in mind, 
the leader played a relatively inactive role in 
the determination of group issues, encouraged 
member-to-member interaction, promoted an 
attitude of mutual support and sharing of ex- 
periences and feelings, and minimized his in- 
vestigation of personality dynamics. 

b. In the Active-Interpretive role, the aim 
of the leader was to emphasize investigation 
and interpretation with a view to promoting 
understanding of underlying motivations. To 
this end he analyzed the feelings and attitudes 
of group members, and communicated to them 
his understanding of the dynamics. The leader 
was the central figure in the organization of 
the group. Thus, he played a relatively active 
role in the determination of group issues, 
clarified issues for the purpose of encouraging 
further investigation, investigated and inter- 
preted the motivations for member behavior, 
and focused on individual understanding of 
feelings and attitudes. 


Validation of Leader Role 


The degree to which both leaders were suc- 
cessful in assuming the two different roles was 
determined by having two resident psychia- 
trists, each with a minimum of a year of group 
psychotherapeutic experience, judge leader 
roles from a number of tape recordings of 
group sessions. 

Twenty-four time samples of ten minutes 
each from eight group meetings were used for 
this validation procedure. The eight meetings 
were selected at random from those sessions 
which were later to be used in the analysis of 
the data. The ten minute selections were taken 
from the early, middle, and late portions of 
each of the eight meetings. The time samples 
were equally divided between the AP and AI 

Table 2

Means, Pre- and Posttherapy, t Ratios, and Probabilities for the Significance of the Difference
Between the Means on the Palo Alto Hospital Adjustment Scale (HAS)

                                  Hospital Adjustment Scale

                         Total   Subscale  Subscale  Subscale  Subscales I & II
Groups                   Score      Iᵃ       IIᵇ      IIIᶜ        Combined

Experimental (combined)
  Means  Pre              45.1     43.8      48.5     43.0          45.3
         Post             49.5     48.4      56.7     44.7          50.6
  t                       1.60     1.81      1.56      .36          1.90
  p                       <.10     <.05      <.10     >.35          <.05
Control
  Means  Pre              48.3     50.2      53.8     40.2          47.6
         Post             48.5     52.2      47.2     42.6          47.6
  t                        .03      .33     -.?1       .26           .01
  p                       >.90     >.70      >.40     >.80          >.90

ᵃ Communication and Interpersonal Relations.
ᵇ Care of Self and Social Responsibility.
ᶜ Work, Activities, and Recreation.


sessions, and the order of presentation was 
randomized each time in an effort to mini- 
mize any patterning effect. 

Both judges correctly identified the role 
which the leader assumed in twenty-two of the 
samples (92%). For the other two samples, 
one judge identified the leader’s intent cor- 
rectly, while the other judge disagreed. Thus, 
96% of the judgments confirmed the leaders’ 
interpretations of role. 
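
The two percentages follow from the counts reported above. A minimal Python sketch of the arithmetic (the variable names are illustrative and do not come from the original report):

```python
# Counts as reported: 24 time samples, each judged independently by 2 judges.
samples = 24
judges = 2
both_correct = 22                 # samples on which both judges named the intended role
split = samples - both_correct    # samples on which exactly one judge was correct

# Share of samples on which both judges were correct.
pct_samples = 100 * both_correct / samples

# Share of individual judgments that matched the leader's intended role.
correct_judgments = both_correct * judges + split
total_judgments = samples * judges
pct_judgments = 100 * correct_judgments / total_judgments

print(round(pct_samples), round(pct_judgments))  # 92 96
```

The first figure counts samples (22 of 24); the second counts single judgments (46 of 48), which is why it is the larger of the two.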


Leaders 


The two leaders in this study were the au- 
thors of this paper. They were clinical psy- 
chologists who had each worked at least two 
years in a mental hospital. Each had a mini- 
mum of one year of group psychotherapeutic 
experience with psychotic patients. In addi- 
tion, each leader had personal experience as a 
member of a training group. 


Measures 


The measure of therapeutic effectiveness 
used was the Palo Alto Hospital Adjustment 
Scale (HAS). This is a rating scale designed 
to evaluate patients’ behavior in a psychiatric 
hospital, with emphasis on interpersonal be- 
havior (8). The scale is divided into three 
subscales, each of which can be scored separately.
Subscale I measures “Communication
and Interpersonal Relations”; Subscale II 
deals with “Care of Self and Social Responsi- 
bility”; Subscale III rates “Work, Activities, 
and Recreation.” Ratings are made by the 
ward attendant, and a score is obtained which 
gives a quantitative estimate of the patient’s 
adjustment in the hospital; the higher the 
score, the better the adjustment. 


Results 


Hypothesis 1 was tested by comparing pretherapy
scores with posttherapy scores on the
HAS for the combined experimental group and 
also for the control group. The statistic used 
was the t ratio for the significance of the difference
between correlated means. Since the
prediction for the control group did not in- 
volve a directional trend, the two-sided test 
of significance was used. The prediction of 
positive changes in the experimental groups 
allowed the use of a one-sided test. 
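
As an illustration of the statistic named above, the following Python sketch computes a t ratio for correlated (paired) means. The ratings are hypothetical and the function name is an assumption made for this example; it is not taken from the study.

```python
import math
from statistics import mean, stdev

def paired_t(pre, post):
    """t ratio for the difference between correlated (paired) means."""
    if len(pre) != len(post):
        raise ValueError("pre and post must have the same length")
    diffs = [b - a for a, b in zip(pre, post)]
    n = len(diffs)
    se = stdev(diffs) / math.sqrt(n)   # standard error of the mean difference
    return mean(diffs) / se, n - 1     # (t, degrees of freedom)

# Hypothetical HAS ratings for five patients, for illustration only.
pre = [40, 45, 50, 42, 48]
post = [44, 47, 49, 47, 53]
t, df = paired_t(pre, post)
print(round(t, 3), df)
```

When the observed difference lies in the predicted direction, the one-sided probability for a given t is half the two-sided probability, which is why the two kinds of test can lead to different conclusions for the same t ratio.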

Table 2 shows that the control group made 
no significant improvement on the HAS rat- 
ings. This was as predicted. The p value for 
the difference between the means of the pre- 
and posttherapy scores for the combined ex- 
perimental groups was < .10. This was ac- 
cepted as suggestive of an important trend, 
but not as sufficient basis for confirmation of
the hypothesis that group therapy makes a 
difference with respect to clinical improve- 
ment. 

On closer inspection of the results, how- 
ever, it became evident that the behaviors 
measured by Subscale I and Subscale II were 
most directly affected. Subscale III made no
significant contribution to the over-all picture 
of improvement (Table 2). As was previously 
noted, Subscale III evaluates changes in the 
areas of Work, Activities, and Recreation. In 
retrospect, it seemed unreasonable to expect 
changes on this part of the HAS, as the op- 
portunities in these areas were relatively static 
and limited on the wards from which most of 
the patients came. 

Subscale III was therefore excluded, and 
the analysis was repeated using an HAS score 
based on Subscales I and II combined. The 
results (Table 2) show that the experimental 
groups, by this measure, manifested improve- 
ment at the .05 level of significance. Pre- and 
posttherapy comparisons for the control group 
showed no significant change. The findings on 
the HAS for Subscales I and II combined con- 
firm the hypothesis that group therapy will 
effect significant clinical changes in chronic 
schizophrenic patients. 

The experimental and control groups can 
also be compared in terms of the overlapping 
of their gains on the HAS. Three of the seven 
patients in the control group, 43%, showed 
more gain than the average experimental pa- 
tient. In the experimental groups, 18 of 32 
patients, 56%, showed more gain than the 
average control patient. These percentages 
show the existence of real differences, but em- 
phasize their small magnitude. 
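
The overlap percentages can be checked directly from the counts given in the paragraph above; a short Python sketch (the helper name is hypothetical):

```python
def pct(part, whole):
    """Percentage rounded to the nearest whole number."""
    return round(100 * part / whole)

# Control patients whose gain exceeded the average experimental gain.
control_over_experimental_mean = pct(3, 7)
# Experimental patients whose gain exceeded the average control gain.
experimental_over_control_mean = pct(18, 32)

print(control_over_experimental_mean, experimental_over_control_mean)  # 43 56
```

The two group sizes used here, 7 and 32, sum to the 39 patients reported in the Summary.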

Hypothesis 2 was tested by comparing the 
mean difference between the pre- and post- 
therapy HAS ratings for the combined AP 
groups with the mean difference between the 
pre- and posttherapy HAS ratings for the 
combined AI groups. The statistic used was 
the t ratio for the significance of the difference
between uncorrelated means. A two-sided
test of significance was utilized. The analysis 
was repeated for the scores based on Sub- 
scales I and II combined. No significant dif- 
ferences were found. The results of the sta- 
tistical treatment of the data did not permit 
rejection of the hypothesis that there would
be no significant difference in the therapeutic 
effectiveness of the two methods. 
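
The comparison of gains between two independent sets of groups can be sketched as follows. This is a minimal Python illustration that assumes the pooled-variance form of the t ratio for uncorrelated means; the report does not say which form was used, and the gain scores below are invented.

```python
import math
from statistics import mean, variance

def independent_t(x, y):
    """Pooled-variance t ratio for the difference between uncorrelated means."""
    nx, ny = len(x), len(y)
    df = nx + ny - 2
    pooled = ((nx - 1) * variance(x) + (ny - 1) * variance(y)) / df
    se = math.sqrt(pooled * (1 / nx + 1 / ny))  # standard error of the difference
    return (mean(x) - mean(y)) / se, df

# Hypothetical pre-to-post gains, for illustration only.
ap_gains = [4, 2, -1, 5, 5]
ai_gains = [1, 4, -2, 0, 2]
t, df = independent_t(ap_gains, ai_gains)
print(round(t, 3), df)
```

Each patient contributes one gain score (posttherapy minus pretherapy), so the two lists are independent samples rather than paired observations.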

When the data were analyzed separately for 
the groups, the findings suggested that there 
were additional factors operating to influence 
the results. There seemed to be some sort 
of interaction between an approach and the 
characteristics of the person assuming it. An 
analysis of covariance, however, showed no 
significant interaction. 


Discussion 


The finding that group therapy effects sig- 
nificant improvement in the interpersonal 
functioning of chronic schizophrenic patients 
is consistent with the qualitative observations 
of other workers, and similar to the results 
obtained by Sacks and Berger (11). 

This result is of interest since disturbances 
in interpersonal relations are considered as 
central to the pathology of schizophrenia. 
The goal of group psychotherapeutic endeavor 
with schizophrenic patients has been seen as 
one of social rehabilitation. Semrad, in sum- 
ming up the experiences of the staff at the 
Boston State Hospital, stated that “we felt 
the patients attain from group therapy a 
social rehabilitation rather than a definite 
change in their personality trends” (12, p. 
235). Gurri and Chasen observed “that some- 
times they re-learn the art of social inter- 
course without even losing their delusional 
trends” (5, p. 52). In groups, patients may 
gradually modify previously established, in- 
effective social attitudes and techniques and 
in this way be better able both to deal with 
others and to assume greater responsibility for 
their own needs. The positive results from 
group therapy in this study support observa- 
tions along these lines. 

The therapeutic goals in the present study 
were limited in the sense that interpersonal 
adjustment within the hospital setting was 
the criterion for improvement. With chronic 
schizophrenic patients, this was seen as a 
necessary first step. While it was demon- 
strated that these people respond in a posi- 
tive manner to such influences, it remains to 
be shown that through group therapy chroni- 
cally sick mental patients can gain sufficient 
personality organization and drive to re-establish
social ties outside the hospital.


Summary 


The effectiveness of group psychotherapy 
with chronic schizophrenic patients and the 
relative merits of two different therapeutic 
methods were evaluated. Thirty-nine patients 
were selected and assigned to five matched 
groups, four experimental and one control. 
The two methods of group therapy were desig- 
nated Active-Participant and Active-Interpre- 
tive, and were characterized by contrasting 
styles of leadership. The experimental groups 
each had 50 hours of therapy. The measure of 
therapeutic effectiveness used was the Palo 
Alto Hospital Adjustment Scale. 

The results permitted the conclusion that 
chronic schizophrenic patients improve in 
group therapy with respect to interpersonal 
functioning. Differences in the relative merits 
of the two methods of group therapy were not 
demonstrated. 
The Need for Representative Design in Studies
of Interpersonal Perception 


Wayman J. Crow 
Behavior Research Laboratory, University of Colorado 


Brunswik (3, 4, 5) has pointed out that a 
functionally oriented psychology requires a re- 
search design of its own. Using examples taken 
primarily from research on the perception of 
physical size, he has demonstrated the re- 
stricted nature of classical systematic design 
and the necessity for representative design. 
The purpose of this paper is to illustrate the 
implications of representative design as applied
to research on interpersonal perception.¹

In the typical interpersonal perception study, 
a group of subjects (Ss) is asked to observe 
another person (object). From this observa- 
tion, the Ss are required to estimate how that 
person (object) would respond to some pro- 
cedure. The Ss’ accuracy can be established 
by comparing their estimations with the actual 
performance of the object-person. In the usual
study, the Ss have been sampled from some 
defined population, and results can be general- 
ized to the parent population by well-known 
statistical procedures. Not generally heeded, 
however, is Brunswik’s admonition that the 
same requirements of statistical logic apply to 
the objects as well as the subjects of psycho- 
logical studies. 

If randomly-selected Ss are asked to esti- 
mate the responses of a number of randomly- 
selected object-persons to a personality ques- 
tionnaire, results may be generalized to both 
parent populations. The precision of the esti- 
mate of the population parameter is a func- 
tion of both the standard deviation of the 
sample and the inverse of the sample size, or 
N. Obviously, generalization to a population 
of Ss would not be risked if the sample of Ss 


1 Hammond has pointed out examples of failure to
consider object sampling (and its consequences) in
opinion polling (11) and clinical psychology (12).
was known to be unrepresentative of that
population, or if only one representative from 
that population was provided in the study. If 
the representativeness of the sample cannot be 
established, then the use of statistical infer- 
ence is invalid. If only one representative of 
the population is included in the sample, then 
of course there is no way to estimate the 
sampling error, and therefore it is impossible 
to estimate what the results would have been 
if a different S had been used. This logic ap- 
plies with equal force to the person-objects 
used in experiments. It is especially important 
for the problem of interpersonal perception, 
for it must be granted that variability in per- 
sonality traits exists whether a person is per- 
ceiver or perceived, i.e., whether he is subject 
or object. It is mandatory, therefore, either to 
use an adequate sample of objects randomly 
selected from the population of objects to 
which we wish to generalize, or to forego 
generalization. 
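
The point about a single object can be made concrete with a small Python sketch. The accuracy scores are hypothetical, and the usual standard error of the mean (sample standard deviation divided by the square root of N) is assumed as the measure of precision.

```python
import math
from statistics import stdev

def standard_error(scores):
    """Standard error of the mean, or None when it cannot be estimated."""
    n = len(scores)
    if n < 2:
        return None  # a single object gives no estimate of sampling variability
    return stdev(scores) / math.sqrt(n)

# Hypothetical accuracy scores of one S judging different object-persons.
one_object = [0.62]
eight_objects = [0.62, 0.41, 0.75, 0.58, 0.49, 0.70, 0.55, 0.66]

print(standard_error(one_object))     # None
print(standard_error(eight_objects))
```

With one object there is no variability to measure, so nothing can be said about how the result would change with a different object; with several objects the same calculation that is routinely applied to subjects becomes available.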

But to forego generalization to object popu- 
lations leads to research of limited usefulness. 
Much of current research is focused upon in- 
terpersonal perceptiveness as a general ability. 
For example, Kelly and Fiske (15) used an 
interpersonal perception measure as a cri- 
terion of diagnostic competence for clinical 
psychologists, and Gage and Suci (11) have 
investigated the relationship of interpersonal 
perception and teaching ability. The knowl- 
edge that a subject is accurate in estimating 
a single object-person’s responses may be use- 
ful under special circumstances, but much 
more useful would be a measure that informs 
us about a S’s accuracy in general. If gener- 
alization is foregone, however, all that remains 
is the knowledge that a clinician accurately 
estimated a particular patient’s responses;
there is no way to determine how accurately 
he would estimate responses of “patients in 
general.” And, for the study of diagnostic 
competence, this is precisely the information 
needed. 

It is the contention of this paper that, at 
the present time, a “double standard” with 
regard to subject and object sampling in in- 
terpersonal perception research is the rule 
rather than the exception. For example, Luft 
(16) asked clinicians and nonclinicians to 
estimate how a patient would respond to a 
personality questionnaire. He concludes, “The 
results suggest that there is no direct rela- 
tionship between clinical training and the 
ability to predict verbal behavior of an indi- 
vidual” (16, p. 758). Luft does not caution 
his reader that in each of his two experiments 
he requires his subjects to make estimations 
for only one patient, i.e., one object. Luft 
does show concern for subject sampling (and 
thereby demonstrates the double standard) 
when he states, “What was surprising was the 
poor showing of the psychodiagnostician who 
had much more material on which to base his 
impressions than the other judges. Of course, 
his was only a single instance and it would 
be unfair to generalize” (emphasis added) 
(16, p. 757). 

In theoretically oriented studies, where gen- 
eralized ability is not the primary focus of at- 
tention, unrepresentative selection of objects 
still leads to conclusions of limited usefulness. 
Research concerned with the isolation, ma- 
nipulation, or interrelation of important vari- 
ables in interpersonal perception can be af- 
fected by “accidents” in the selection of 
objects as well as subjects. Conventionally, 
uncontrolled independent variables are as- 
sumed to have random effects upon the de- 
pendent variables of an experiment. The ex- 
perimenter attempts to insure the randomness 
of the effects of uncontrolled independent 
variables associated with the subjects by se- 
lecting the subjects at random from a defined 
population. Uncontrolled independent vari- 
ables associated with the objects, however, 
also affect the results; therefore the experi- 
menter should use as much care in the selec- 
tion of objects as he does with subjects (cf. 
12, p. 155). Otherwise, the results are affected 
by accidents of selection in an unknown way
and may become nonreproducible. 

An example is furnished by a study by 
Gage (9), who used four objects selected from 
students in an educational psychology class, 
plus two objects who were female clerical employees.
He pointed out that the latter “. . .
were thus not members of the same ‘cultural 
subgroup,’ a fact which may help us under- 
stand some of the results” (9, p. 7). Gage 
found that the correlations between the Ss’ 
odd and even predictions for these two objects 
were negative, although the Ss’ odd-even cor- 
relations for the other four objects were posi- 
tive. Gage’s results were vitally affected by 
uncontrolled independent variables associated 
with the objects used in his study. Similar 
contradictions in results should be anticipated 
by experimenters whenever they fail to aban- 
don the “double standard” and do not use the 
same well-known sampling procedures in the 
selection of their objects that they have used 
conventionally in the selection of their sub- 
jects. 

Although Brunswik first stated the case for 
representative design roughly fifteen years ago, 
recent investigations of interpersonal percep- 
tion have failed to meet his criteria in one or 
more of the following ways: (a) either an inadequate
number of objects was used, (b) the
population from which the objects were se- 
lected was ill-defined, or (c) the procedure 
for sampling the objects was not reported, or 
if reported was biased in an unknown way (1, 
2, 6, 7, 8, 9, 10, 14, 15, 16, 17, 18, 19, 20). 

As Brunswik has stated, “The case in which 
not only the responding subjects but also the 
stimulus objects are persons furnishes perhaps 
the most obvious demonstration of the neces- 
sity for representative design” (4, p. 213). 
The fact that representative design makes the 
experimenter’s task somewhat more difficult 
should not lead the experimenter to ignore 
the fact that representative design is intrinsic 
to the study of interpersonal perception. 

A Note on the Clinical Validity of the Marsh-Hilliard-Liechti
MMPI Sexual Deviation Scale


William C. Holz, George F. Harding, and Sidney M. Glassman
U.S. Army


In a previous edition of this Journal, Marsh
et al. (1) proposed a sexual deviate (Sd) scale
for the MMPI. This scale was devised to dis-
tinguish incarcerated sexual criminals from
normal individuals; in addition, the authors
suggested that it may also find more general
application in the clinical diagnosis and treat-
ment of sexual pathology. Peek and Storms
(3), who questioned the latter generalization,
tested a small sample from a state hospital
population, and concluded that the scale was
primarily a measure of general abnormality.
The present authors have sought additional
cross validation by applying the scale to a
more homogeneous population than is found
in the previous studies.

The MMPI was administered to 105 Army
enlisted men, ranging in age from 19 to 31
years. Included in this group were four classi-
fications (2): (a) Sexual Deviate (S-D), 21
known, admitted deviates who were in proc-
ess of discharge from the service and who had
been examined by a psychiatrist to assure that
the deviant sexual behavior was not symp-
tomatic of a neurosis or psychosis; (b) La-
tent Homosexual (L-H), 12 Ss whose be-
havior was adjudged by a psychiatrist to be
strongly influenced by latent homosexuality;
(c) Character Disorder (C-D), 42 Ss who re-
ceived the diagnosis of “character disorder,”
excluding those who were in the “Sexual De-
viate” subclassification; and (d) Non-Psychi-
atric (Non-P), 30 Ss selected at random from
an Army unit.


Table 1
Comparison of Group Means by t Test

          L-H       C-D       Non-P
S-D      1.46*      [?]       2.90**
L-H                 1.60*     7.95**
C-D                           6.99**

* Significant at .10 level.
** Significant at .01 level.

Analysis of the data reveals that the stand- 
ard deviation of the scores of the S-D group 
is approximately twice that of the other 
groups, and their scores range from the high- 
est to the lowest of those tested. The signifi- 
cance of difference between group means was 
tested. The S-D’s are significantly differenti- 
ated from the Non-P’s, but not from the other 
maladjusted groups. Additionally, each of the 
other two maladjusted groups is equally well 
differentiated from the Non-P’s. The correla- 
tions of the Sd scale with four other scales 
which are especially sensitive to general psychiatric
maladjustment are: F .74, K −.70,
Pt .83, Sc .81.

The authors’ conclusions are essentially the 
same as those of Peek and Storms (3), that 
the scale measures generalized psychiatric 
maladjustment. The findings of Marsh et al.
(1) apparently resulted from the gross dif- 
ferences between their comparison groups. 
This study indicates that their scale cannot 
be used to distinguish between sexual deviates 
and other maladjusted groups. 

The Significance of Patient-Staff Rapport in the
Rehabilitation of Individuals with Chronic
Physical Illness¹


F. C. Shontz and S. L. Fink 
Highland View Hospital 


Rehabilitation of persons suffering from 
chronic physical illnesses attempts to restore 
these persons to levels of functioning as close 
as possible to their former states, within the 
limits of their disabilities. Usually this kind 
of rehabilitation includes physical and occupa- 
tional therapy as well as the necessary medi- 
cal and surgical services. 

The physical aspects of the rehabilitation 
process are fairly well understood and may 
best be left to the medical profession to ex- 
plain. Our understanding of psychological fac- 
tors, however, is in a relative state of specu- 
lation and conjecture, perhaps because the 
psychologist’s past contribution has consisted 
largely of empirical studies, many of which 
are probably of a doubtful value from a 
methodological point of view (2), and “clinical
impressions,” which lack the refinement
of controlled investigation (3, 4, 5). An as- 
pect of rehabilitation which the present au- 
thors feel to be of primary psychological 
importance is the communicative relationship 
between the patient and the staff working 
with him. The importance of communicative 
sensitivity has been emphasized by Barker 
and Wright who suggest, “The worker will be 
most effective when he is sensitive to the clues 
given by the client as to the course their rela- 
tionship should take,” and, “Being sensitive 
to the client means that the worker must take 
into account the emotional meaning of the 
disability . . .” (3, p. 23). 


1 Credit is also due Messrs. Don K. Worden and 
Robert Postel, who served as assistants on this proj- 
ect, and to the Cleveland Foundation which provided 
the necessary funds. 
In the present study, research efforts were
concentrated upon patient-staff relationships 
in the areas of physical and occupational ther- 
apy for two reasons: (a) in the hospital set- 
ting available, both departments were known 
to keep regular progress notes on all patients, 
and (b) the two services represent related but
not psychologically identical treatment situa- 
tions. Further, the hospital at which this 
study took place possesses a special intensive 
treatment program for certain groups of pa- 
tients. The program (now in its experimen- 
tal stage) provides for more frequent ther- 
apy; higher staff-to-patient ratio; regularly 
scheduled ward conferences; frequent medical 
rounds; and group psychotherapy by a psychologist.
A ready-made situation was therefore
available in which patient-therapist understanding
(or “communicative rapport”)
could be quantitatively related to the variables
“progress in rehabilitation” and “participation
or non-participation in an intensive
treatment program.”

The semantic differential (7, 8) was se- 
lected as the best measurement of communi- 
cative rapport, since it not only provides in- 
dices of similarity between subjects (Ss) in 
terms of the connotative meanings of verbal 
concepts, but it also implicitly allows for sta- 
tistical manipulation of a S’s directly ex- 
pressed associations without regard to meas- 
ures based on interindividual comparisons. 

Specifically, the purpose of the present re- 
search was to use semantic differential meth- 
odology to evaluate two research propositions 
and their corollaries: 

I. Patients with chronic physical illnesses 
who are on an intensive treatment program
have closer communicative rapport with (a)
occupational therapists, and (b) physical
therapists, than do patients not on an intensive
treatment program.


As a corollary: 





I-a. Semantic concepts relevant to patients’ 
current situations possess different semantic 
values for patients on an intensive treatment 
program than for patients not on an intensive 
treatment program. 

II. Patients who are judged by therapists 
to have shown great improvement in occupa- 
tional and physical therapy have closer com- 
municative rapport with these therapists than 
do patients who are judged to have shown 
little improvement in these areas. 


As a corollary: 








II-a. Semantic concepts relevant to pa- 
tients’ current situations possess different 
semantic values for patients judged to have 
shown great improvement in occupational and 
physical therapy than for patients judged to 
have shown little improvement in these areas. 

Supplementary investigations. In addition 
to the tests of the major hypotheses, the pres- 
ent study concerned itself with investigating 
the effects of the variables “age,” “sex,” and 
“length of hospital stay,” upon communica- 
tive rapport. Each variable was evaluated to 
determine whether patients of either sex, or 
at different levels of age or length of hospital 
stay, formed homogeneous semantic groups 
with high intragroup communicative rapport. 
Further tests were conducted to determine 
whether patients of either sex or at particu- 
lar levels of age or hospital stay showed espe- 
cially close communicative rapport with physi- 
cal or occupational therapists; and, finally, 
tests were conducted to determine whether 
patients of either sex or at different levels of 
age or hospital stay showed any differences in 
directly expressed semantic values, without 
respect to measures derived from interindi- 
vidual comparisons. 


Method 
Measurement 


The semantic differential employed was 
made up of twelve concepts rated by all Ss 
on each of twelve associative scales, a total of
144 ratings per administration. Seven con- 
cepts were selected to tap associations of cur- 
rent situational significance to the patient 
population: Occupational Therapy; Physical 
Therapy; Rehabilitation; Independence; Bed; 
Surgery; and Psychology. Five concepts were 
selected to represent associative areas the im- 
portance of which was felt to be relatively 
independent of hospital life as such: Life; 
Dream; Fate; Future; and Old Age. The 
scales were constructed to represent the three 
dimensions: Evaluative, Potency, and Activity.
Each scale was characterized by the
names assigned to the two extreme ends of 
its continuum. The four Evaluative scales 
were: Good-Bad; High-Low; Happy-Sad; 
and Beautiful-Ugly. The four Potency scales 
were: Tough-Tender; Heavy-Light; Full- 
Empty; and Sharp-Dull. The four Activity 
scales were: Fast-Slow; Jumpy-Calm; Fero- 
cious-Peaceful; and Tense-Relaxed. Each S 
rated every concept from one to seven on 
each of the scales, a value of one indicat- 
ing semantic agreement with the first named 
extreme of the continuum (Good, Happy, 
Tough, Sharp, etc.), a value of seven indi- 
cating semantic agreement with the second 
named extreme (Bad, Sad, Tender, Dull, 
etc.). Numbers from two to six were used for 
degrees of agreement between these two ex- 
tremes, and the number four represented the 
middle category. 

The instrument provided two types of meas- 
ures: (a) the index D, corresponding to the 
“Index of Semantic Harmony,” described in 
detail by Osgood and Suci (7, 8), and (b)
the absolute values of the subject’s directly 
expressed ratings. The index D is a mathe- 
matical expression of the semantic “distance” 
between individuals, and the less this distance 
(the smaller the D) the greater the communi- 
cative rapport between the individuals being 
compared was inferred to be. The absolute 
ratings assigned by the S reflect his directly 
expressed semantic associations; and these 
values were used to compare one group of 
individuals with another without recourse to 
the interpersonally derived index D. 
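
A minimal Python sketch of the index D follows. It assumes the usual Euclidean form of the distance, the square root of the summed squared differences between two raters' corresponding ratings; the text above does not give the formula, so this form, the function name, and the short profiles are assumptions made for illustration.

```python
import math

def index_d(ratings_a, ratings_b):
    """Semantic distance between two raters' profiles (assumed Euclidean form)."""
    if len(ratings_a) != len(ratings_b):
        raise ValueError("profiles must contain the same number of ratings")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(ratings_a, ratings_b)))

# Hypothetical 1-to-7 ratings on four concept-scale pairs.
# A full administration of the instrument described above has 144 such pairs.
patient = [1, 4, 7, 2]
therapist = [2, 4, 5, 2]

print(round(index_d(patient, therapist), 3))  # 2.236
print(index_d(patient, patient))              # 0.0
```

Identical profiles give a D of zero, and D grows as the two raters' ratings diverge, which matches the reading of a smaller D as closer communicative rapport.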


Administration. Because of the physical 
condition of many of the patients, the semantic
differential was administered orally and individually,
with the examiner recording each
S’s responses. Randomness of stimulus pres- 
entation was obtained by listing each concept 
and scale on separate 3” x 5” cards, which 
the examiner shuffled before each administra- 
tion. A concept card was selected and placed 
before the S where he could refer to it as de- 
sired. Next, the scale cards were randomly 
presented by the examiner while the S made 
appropriate semantic judgments. Scale cards 
were reshuffled between each concept presenta- 
tion until the S had rated all concepts on all 
scales. 

The instructions included a brief descrip- 
tion of the procedure, an explanation that 
this was a research to “find out how people 
feel about hospital life,” and directions to 
provide numerical association values accord- 
ing to the “meaning the words have for you.” 
Clarification was provided as necessary, but 
most questions were answered with an indi- 
rect “whatever you think,” or “any way you 
like.” 


Reliability. Reliability was established in 
an earlier study which demonstrated that 
variations between ratings assigned by the 
same individual on successive administrations 
of the semantic differential were significantly 
smaller (p less than .001) than variations be- 
tween ratings assigned by individuals paired 
at random. This represents evidence of suffi- 
cient reliability for purposes of the present 
research. 


Subjects 


The Ss were selected from the patient and
professional population of Highland View 
Hospital, Cleveland, Ohio. Twenty-seven pa- 
tients, nine occupational therapists, and seven 
physical therapists participated. Patients were 
not selected on the basis of their affliction 
with any particular physical illness, although 
there was reason to believe that all groups 
were closely comparable with regard to “de- 
gree of disability” as such. Patients with 
known organic brain pathology were specif- 
ically excluded. 

Of the 27 patients, 24 served in the investigation
for possible age, sex, and hospitalization
differences. It was convenient to distribute
these subjects into a 2 x 2 x 2 factorial
design, with three Ss in every cell. The
criteria for dichotomizing the variables were: 


1. The younger group included 12 patients under 
30 years of age; the older group included 12 patients 
over thirty years of age. (Mean ages: 22.5 years and 
40.7 years; average deviations: 2.3 and 6.6 years, 
respectively.) 

2. Twelve males and 12 females were included. 

3. The sample was divided according to whether 
the individual had been in the hospital for more or 
less than one year. (Mean number of months: 26.2 
and 4.7; average deviations: 7.8 and 2.7 months, re- 
spectively.) There were 12 patients in each group. 

Therefore, of the 24 individuals, three were males 
under 30 years of age who had been in the hospital 
less than one year. Three were males under 30 years 
of age who had been in the hospital more than one 
year. There were two similar groups of females; and 
a similar pattern was repeated for Ss over 30 years 
of age, making a total of eight cells containing three 
Ss each. The balance of the patients was used when 
necessary and appropriate in the tests of the three 
major propositions and their corollaries.
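
The balanced layout described above can be sketched in a few lines of Python. This is only an illustration of the 2 x 2 x 2 arrangement; the level labels and the helper names (`build_cells`, `marginal_count`) are hypothetical and do not come from the original study.

```python
from itertools import product

# Levels of the three dichotomized variables described in the text.
AGE = ("under 30", "over 30")
SEX = ("male", "female")
STAY = ("less than one year", "more than one year")
SS_PER_CELL = 3  # three Ss in every cell, as stated in the text


def build_cells():
    """Return every (age, sex, stay) combination of the 2 x 2 x 2 design."""
    return list(product(AGE, SEX, STAY))


def marginal_count(cells, index, level):
    """Count the Ss who fall at one level of one variable."""
    return sum(SS_PER_CELL for cell in cells if cell[index] == level)


cells = build_cells()
total_ss = len(cells) * SS_PER_CELL
younger_ss = marginal_count(cells, 0, "under 30")
```

Because every cell holds the same number of Ss, each level of each variable is represented by the same number of patients, which is what allows one variable to be examined while the other two are held in balance.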


It was originally planned to match each 
patient-subject with his own physical and 
occupational therapist, but staff assignment 
rotations, personnel turnover, and other con-
siderations militated against such a pro-
cedure. The use of “composite therapists”
seemed feasible, however, since, on an em- 
pirical basis, all therapists were found to show 
high interpersonal semantic agreements (com- 
municative rapport) within their own depart- 
ments. So far as the items on the present 
scale are concerned, the associations expressed 
by any one therapist were generally good pre-
dictors of the associations expressed by them
all, interindividual variability being far more
characteristic of the patients than of the staff. 
The nine occupational therapists were there- 
fore used in the compilation of the Median 
OT, a composite derived by taking the median 
associative value assigned by the therapists to 
each of the 144 ratings on the semantic dif- 
ferential. Patients were compared to the com- 
posite, Median OT, for the measurement of 
communicative rapport. The seven physical 
therapists were used in the compilation of a 
Median PT, which also served as a standard 
for the measurement of semantic similarity 
between patient and therapist as required.


Procedure


The nonparametric Mann-Whitney U test,
described by Moses (6) and Auble (1), was 
the statistical technique employed. The U test 
is applicable when samples are of equal or un- 
equal size and implies no assumptions about 
parameter distributions. 
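
As a rough illustration of the statistic, the following Python sketch computes U by the usual pairwise-count definition. It is not the authors' computing procedure; the function name `mann_whitney_u` and the sample values are hypothetical, and the significance of an obtained U would still have to be read from published tables such as those cited above.

```python
def mann_whitney_u(sample_a, sample_b):
    """Return (U_a, U_b, U), where U is the smaller of the two statistics.

    U_a counts the pairs in which a score from sample_a exceeds a score
    from sample_b; tied pairs contribute one half to each statistic.
    """
    u_a = 0.0
    for a in sample_a:
        for b in sample_b:
            if a > b:
                u_a += 1.0
            elif a == b:
                u_a += 0.5
    u_b = len(sample_a) * len(sample_b) - u_a
    return u_a, u_b, min(u_a, u_b)


# Hypothetical D scores for two groups of unequal size.
group_1 = [2.1, 3.4, 2.8, 3.0]
group_2 = [4.2, 3.9, 5.1, 4.4, 3.6]
u_1, u_2, u = mann_whitney_u(group_1, group_2)
```

The count depends only on the ordering of the scores, which is why the test requires no assumption about the form of the underlying distributions and works with samples of unequal size.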


Preliminary investigations. The tests of 
variables, other than intensive treatment, 
which might affect a patient’s semantic asso- 
ciations utilized the 2 x 2 x 2 factorial de- 
sign. Within this framework, 54 pairs of Ss 
were selected so that, for each variable, 18 
pairs represented homogeneous matchings at 
one extreme of the variable (e.g., old with 
old); and 18 pairs represented homogeneous 
matchings at the other extreme (young with 
young). The remaining 18 pairs were cross- 
matchings on the variable in question (old 
with young). The null hypothesis, that D 
scores from the homogeneous matchings and 
from the cross-matchings were samples from 
a common population, was evaluated with one- 
tailed tests of significance. 
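
The D scores entering these comparisons can be illustrated with a short Python sketch. It assumes that D is the generalized distance ordinarily used with the semantic differential (the square root of the summed squared differences between two sets of ratings); that formula is an assumption here, since it is not restated in this section. The function names and the ratings are hypothetical. The `composite` helper mirrors the description of the Median OT and Median PT, taking the median value assigned by the therapists to each rating.

```python
from math import sqrt
from statistics import median


def composite(raters):
    """Median rating, item by item, across a list of raters' rating lists."""
    return [median(item) for item in zip(*raters)]


def d_score(ratings_a, ratings_b):
    """Root of the summed squared differences between two rating profiles."""
    if len(ratings_a) != len(ratings_b):
        raise ValueError("profiles must cover the same ratings")
    return sqrt(sum((a - b) ** 2 for a, b in zip(ratings_a, ratings_b)))


# Hypothetical 7-point ratings on three items (the study used 144).
therapists = [[1, 2, 3], [3, 2, 5], [2, 4, 4]]
median_profile = composite(therapists)
patient = [2, 2, 1]
distance = d_score(patient, median_profile)
```

A smaller D indicates closer semantic agreement between the two profiles, so a series of such distances from homogeneous matchings can be set against a series from cross-matchings.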

Two-tailed tests were used to evaluate the 
null hypothesis that D scores, between pa- 
tients and therapists, would be samples from 
a common population when tested in terms of 
the patient’s sex or level of age and hospital 
stay. 

The final test required the use of the pa- 
tient’s directly expressed semantic associa- 
tions. These were tested to evaluate the null 
hypothesis that semantic values expressed by 
groups of patients, selected to vary system- 
atically with respect to sex, age, and duration 
of hospital stay, represented samples from a 
common universe. Two-tailed tests were made 
on each variable. 


Propositions and corollaries. The major re- 
search propositions required statistical evalua- 
tion of series of D scores obtained through the 
comparison of appropriate patient and thera- 
pist subject-pairs. The corollaries involved the 
comparison of directly expressed semantic as- 
sociations, without regard to interindividual 
similarities. Since directional predictions were 
made in the propositions, one-tailed tests were 
made of the necessary null hypotheses; and 
since no directional predictions were made in
the corollaries, two-tailed tests were employed
for these evaluations. 

Ratings of progress in physical and occu- 
pational therapy were required for the tests of 
proposition II. In occupational therapy, these 
ratings were obtained on 22 patients. The Ss 
were rated by their current therapists on two 
aspects of occupational therapy: Functional 
Capacities and Activities of Daily Living. 
Ratings were summed for all patients, who 
were then divided into two groups of 10 each: 
those who were judged to have shown rela- 
tively great improvement, and those who were 
judged to have shown relatively little improve- 
ment. Two Ss had shown only slight over-all 
improvement, and these Ss were not used since 
they seemed to be adequately representative 
of neither extreme. Similar ratings of prog- 
ress in physical therapy were obtained, except 
that Functional Capacities only were judged. 
Twenty-six Ss were rated and divided into a 
“great improvement” group of 14 Ss and a 
“little improvement” group of 12 Ss. 


Results 


Preliminary Investigations 


In terms of D scores, patients in this sam- 
ple were not found to form homogeneous se- 
mantic groups on the basis of age, sex, or 


Table 1


Mean Values Assigned by Patients with Different Levels
of Hospitalization to the Scales on Which


Hospitalization time
Less than 1 yr.    More than


Scale-group
Evaluative           —        <.025
Good-Bad             2-       <.025
Happy-Sad            2+       <.025
Beautiful-Ugly       2+       <.025
Sharp-Dull           3-       <.025
Tough-Tender         5-       <.025
Heavy-Light          5-       <.05
Jumpy-Calm           5+       <.05
Full-Empty           3-       <.05





Note.—Results rounded to nearest whole number. (+) indi-
cates this number was lowered to nearest whole. (-) indicates
this number was raised to nearest whole.

* Individual scales only were subjected to detailed analysis.


length of time in the hospital. The data pro-
vided no evidence that the variables age, sex,
or length of time in the hospital significantly 
affected the communicative rapport (meas- 
ured by D scores) between patients and the 
Median OT and Median PT composites. 

The directly expressed semantic values of 
younger patients were not significantly dif- 
ferent from the directly expressed semantic 
values of older patients; nor were the directly 
expressed semantic values of male patients sig- 
nificantly different from those of female pa- 
tients. Significant differences were found be- 
tween the directly expressed semantic values 
of patients in the hospital less than one year 
and the directly expressed semantic values of 
patients in the hospital more than one year 
(see Table 1). 


Table 2


Levels of Significance of Differences in Semantic
Similarity Between Patients and Therapists,
for Patients Receiving and Patients Not
Receiving Intensive Treatment


                               For tests of    For tests of
                               D scores        D scores
                               between         between
                               patients and    patients and
                               occupational    physical
                               therapists      therapists
Measure                        p               p

Total semantic differential    <.10            <.04
Scale group Evaluative         chance          chance
Scale group Potency            <.05            <.01
Scale group Activity           <.10            chance

Scales
  High-Low                     <.05            <.10
  Tough-Tender                 <.025           <.01
  Heavy-Light                  <.10            <.10
  Full-Empty                   chance          <.10
  Ferocious-Peaceful           <.025           <.025
  Tense-Relaxed                <.05            chance

Concepts
  Fate                         <.01            <.10
  Occupational Therapy         chance          <.10
  Physical Therapy             <.05            <.10
  Psychology                   <.02            <.10
  Rehabilitation               <.10            <.025
  Surgery                      <.10            <.10


Note.—One-tailed tests. All differences are in the direction 
of patients receiving intensive treatment showing greater simi- 
larity (smaller D) with therapists than patients not receiving 
intensive treatment.


Table 3


Levels of Significance of Differences in Semantic
Similarity Between Patients and Therapists,
for Patients Regarded as Showing
Great and Little Improvement


                               For tests of D scores
                               between patients and
                               occupational therapists
Measure                        p

Total semantic differential    <.01
Scale-group Potency            <.01
Scale-group Activity           <.01

Scales
  Beautiful-Ugly               <.025
  Tough-Tender                 <.01
  Full-Empty                   <.025
  Fast-Slow                    <.01
  Jumpy-Calm                   <.025
  Ferocious-Peaceful           <.01

Concepts
  Dream                        <.025
  Fate                         <.05
  Future                       <.01
  Life                         <.01
  Occupational Therapy         <.01


Note.—One-tailed tests. All differences are in the direction
of patients showing great improvement having greater simi-
larity (smaller D) with therapists than patients showing little
improvement.


To supplement this analysis, tests were run 
on the frequency of use of extreme ratings 
(1’s and 7’s) in these last two subject groups. 
The “less than one year” group showed a sig- 
nificantly greater use of 1’s on the Good-Bad, 
the Happy-Sad, the Beautiful-Ugly, and the 
Sharp-Dull scales (p less than .01). Slightly 
less significant were the results in the same 
direction on the Full-Empty and the Jumpy- 
Calm scales (p less than .05). The “greater 
than one year” group failed to show signifi- 
cantly higher frequencies of 1’s or 7’s on any 
of the scales. 
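
The tally underlying such a comparison might be sketched as follows. The ratings shown are invented for illustration, the function name `extreme_counts` is hypothetical, and the sketch covers only the counting step, since the text does not specify the test applied to the frequencies.

```python
def extreme_counts(ratings, low=1, high=7):
    """Return (number of low-extreme ratings, number of high-extreme ratings)."""
    return (sum(1 for r in ratings if r == low),
            sum(1 for r in ratings if r == high))


# Hypothetical Good-Bad ratings pooled within each hospitalization group.
less_than_one_year = [1, 1, 2, 1, 3, 1, 2, 7]
more_than_one_year = [3, 4, 2, 4, 5, 3, 4, 4]
short_stay_ones, short_stay_sevens = extreme_counts(less_than_one_year)
long_stay_ones, long_stay_sevens = extreme_counts(more_than_one_year)
```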


Propositions and Corollaries 


Significant results obtained in the statisti- 
cal analyses of the research propositions and 
their corollaries are summarized in Tables 2 
and 3. On the basis of these results, proposi- 
tion I was considered to have been verified. 
Proposition II was verified with respect to the 
relationship between patient-therapist com-
municative rapport and rated progress in
occupational therapy, but not with respect 
to the relationship between patient-therapist 
communicative rapport and rated progress in 
physical therapy. Corollaries I-a and II-a
were not considered to have been verified by 
the present investigation. 


Discussion 


The fact that males and females did not 
form homogeneous semantic groups in our 
study may be accounted for in either of two 
ways. The instrument itself may not be sensi- 
tive to such differences, or the size of the 
samples may have been too small for signifi- 
cant differences to appear. The same reason- 
ing may be used to explain our results on the 
age variable, but in this case there also could 
have been too small an age difference between 
the groups to have resulted in significant dif- 
ferences on the test. Both variables would be 
worth investigating in future research, since 
no definite conclusions may be drawn from 
the results obtained here. 

For the variable “length of time spent in the 
hospital,” the results were much more defini- 
tive than for age and sex. The results clearly 
indicated that the length of time patients 
spend in this institution has some measurable 
effect upon their feelings as reflected in the 
semantic differential. It seems that at least 
those aspects of their lives that are repre- 
sented in the test tend to change in meaning 
for them as they remain longer in the hospital 
setting. Apparently what happens is that 
“things in general” become less Good, less 
Happy, less Beautiful, less Full, less Sharp, 
less Tender, less Light, and less Calm. One 
finds what seems to be a growing indifference 
in attitude, and only future research can tell 
whether an actual reversal of feelings eventu- 
ally could occur. Patients in most hospitals 
have a minimum of contact with the more 
stimulating experiences healthy people en- 
counter. Their lives are routinized to such an 
extent that one day or week is the same as 
the next. Much of their lives is lived in fan- 
tasy, and reality no longer offers much emo- 
tional satisfaction. There is good reason to 
suspect that the patient’s level of motivation 
tends to decline with the passage of time in 
an institution, a problem which deserves fu-
ture investigation in relation to the findings
of the present research. 

Goals in rehabilitation are another factor
which we feel to be of prime importance. The
patient’s own initial expectations may tend to
go far beyond what is realistic in terms of the
nature and extent of his disability. Under such
circumstances it could be expected that his 
attitudes, feelings, and motivation would show 
changes as he finds himself unable to reach 
his goal after months or years of effort. 

The verification of proposition I in the 
present investigation demonstrates the exist- 
ence of a measurable relationship between in- 
tensive treatment and the communicative rap- 
port of therapist and patient. It was apparent 
that communicative rapport between patient 
and therapist was generally greater for pa- 
tients receiving intensive treatment than for 
patients not receiving such care; and this 
greater rapport existed to a significant degree 
in the patient’s relationship to both physical 
and occupational therapists. It is fairly cer- 
tain that these results are not contaminated 
by the effects of age, sex, or length of the pa- 
tient’s stay in the hospital. It is not so cer- 
tain that they were not made spurious through 
initial patient or staff selection in establish- 
ing the intensive treatment ward itself or 
through feelings which may have been gen- 
erated in the special-care atmosphere. 

The results of the analysis of proposition 
II suggest that communicative rapport is re- 
lated to rated success in occupational therapy, 
but that it does not bear a relationship to 
rated success in physical therapy. Physical 
therapy is highly formal; and the efforts of 
the therapist are typically directed toward the 
patient’s performance in specific, rigidly pre- 
determined procedures. Motivation is less a 
problem here than in situations where more 
is left to the patient’s initiative. Communica- 
tive rapport might therefore be expected to 
bear little relationship to his progress in 
physical therapy. Occupational therapy, in 
the hospital investigated, is a different kind 
of situation. Here, the patient is not so much 
presented with predetermined routines as he 
is encouraged to become interested in prac- 
tical activities which he must learn in order 
to return to society. Communicative rapport, 
in the sense of agreement with respect to the
value and efficacy of the program, is a sine
qua non of success in occupational therapy; 
and it therefore follows that those who are 
judged to show progress in this area also show 
stronger semantic similarity with those who 
organize their programs. 

The data also suggest the possibility that 
intensive treatment of the type investigated 
here produces slightly more agreement be- 
tween patient and physical therapist than be- 
tween patient and occupational therapist, a 
fact which is consistent with the somewhat 
greater importance ascribed at this institution 
to physical recovery than to social and emo- 
tional readjustment; but it also suggests that 
intensive care could increase its total value 
by placing greater emphasis upon practical 
functional activities and by providing a more 
nearly equal encouragement of the patient in 
less purely medical areas. Although one can- 
not yet claim actual predictive ability for the 
semantic differential, the evidence does sug- 
gest that the instrument has very promising 
possibilities for individual diagnosis in this 
area. 

It is possible at this point to suggest a 
theory which may explain the findings of the 
present research and which may prove con- 
ducive to further investigation. It is the 
opinion of the present authors that patients 
receiving intensive treatment pattern² them-
selves closely after the models provided by
their therapists. The chronically ill person 
shares a certain psychological similarity to 
the child: He is dependent upon others for 
his maintenance and for assistance in learn- 
ing how to cope with the world. It is com- 
monly understood that, for the child, this 
“someone else” is the parent (9, Ch. XIII). 
For the ill person, it is the doctor, the thera-
pist, and perhaps the institution itself. A
proper study of the patterning process as such
would require longitudinal follow-up of pa-
tients from the time they entered rehabilita-
tion until after they returned to their homes.
But the present investigation not only does
not contradict the theoretical idea, it tends to
suggest that this process may actually be fa-
cilitated by intensive treatment. To the ex-
tent, therefore, that patterning is a necessary
condition for maximum learning, intensive
treatment may be considered an important
contribution to the psychological side of re-
habilitation itself.


² The term “patterning” may be defined as some-
thing more than behavioral imitation and/or passive
assimilation of factual knowledge from another per-
son, but as something less than complete equation of
the total-self with the total-self of another indi-
vidual. “Patterning,” as we define it, is selective and
more limited than the traditional meaning of the
term “identification” would imply. While the dy-
namic mechanism through which patterning takes
place may be similar to that of identification, the
scope and content of the two processes are not the
same.

The theory may be more specifically out- 
lined as follows: 

1. Patterning of a patient according to the 
models provided by his therapists is a de- 
terminant of progress in rehabilitation. Pat- 
terning is least significant in areas which are 
formal, highly structured, and minimally de- 
pendent upon the rapport of patient and 
therapist (e.g., physical therapy). It is of 
greatest significance in areas which are in- 
formal, less highly structured, and dependent 
upon favorable personal feelings between pa- 
tient and therapist (e.g., occupational ther- 
apy, social case work, psychotherapy). 

2. Patterning is facilitated by intensive 
treatment, at least in those areas which are 
emphasized by the treatment program itself 
(e.g., “purely medical” vs. “purely practical” 
orientations). It is probably also facilitated 
by an agreement between patient and thera- 
pist which antedates actual patient-therapist 
contact. 

It would seem to be true, on the basis of 
clinical observation, that a preexisting simi- 
larity of patient-therapist attitudes plus in- 
tensive treatment does present excellent con- 
ditions for rehabilitation success, although it 
remains for future research to determine 
which factors are most important and which 
are most easily affected by environmental fac- 
tors. It is also a legitimate problem for fu- 
ture research to determine the effect of 
communicative rapport between patient and 
therapist upon the quality of treatment pro- 
vided by the therapist. Although the present 
research demonstrates the existence of a rela- 
tionship between communicative rapport and 
“perceived progress,” it in no way specifies 
whether this relationship stems from vari-
ability in treatment quality, from distortions
in perception of progress on the part of the 
therapists, or from a host of other possible 
determinants which may covary with the type 
of criterion here employed. 


Summary 


The semantic differential method was used 
to examine certain cognitive aspects of the re- 
habilitation process in a hospital for people 
with chronic physical illnesses. The factors of 
age, sex, length of time in the hospital, par- 
ticipation in an intensive treatment program, 
and improvement in physical and occupational 
therapy were investigated. Each variable was 
analyzed in terms of the “semantic distance” 
between patients and occupational and physi- 
cal therapists. Within the limits of the group 
of Ss selected, and with respect to the specific 
measuring instrument, no significant differences 
were found for the variables age and sex. 
The group of patients hospitalized more than 
one year were significantly different from the 
group of patients hospitalized less than one 
year in terms of directly expressed seman- 
tic associations, perhaps because of an in- 
creased indifference on the part of the former 
group. “Semantic distance” between patient 
and therapist was found to be significantly 
reduced under conditions of intensive treat- 
ment. It was also significantly reduced for pa- 
tients who had shown “great” improvement 
in occupational therapy, although the same 
was not true with respect to progress in physi-
cal therapy. No differences in directly ex-
pressed semantic associations were found with
respect to the variables “intensive treatment” 
and “improvement in therapy.” 

The data were explained on the basis of 
the hospital situation, the characteristics of 
physical and occupational therapy, and the 
type of intensive care provided. A theory of 
“patterning” was presented to unify the find- 
ings and to suggest further research. 


Received October 6, 1956.


Behn-Rorschach and Rorschach under Standard and
Stress Conditions 


Fred Schwartz and Solis L. Kates 


University of Massachusetts 


The present investigation is designed to 
evaluate the equivalence of the Behn and 
Rorschach tests under standard and stress 
conditions. The issue of their equivalence is 
especially relevant to the use of the Behn with 
the Rorschach in studies of stress (3). To 
date, a small number of investigations have 
indicated that the two tests are generally 
equivalent under standard test conditions (1, 
2, 7). As the Behn is said to have a more 
clearly delineated form than the Rorschach 
(7), it was hypothesized that the Behn will 
have more FSh, FK, Fc, FC’, and FC re- 
sponses, and that the Rorschach will have 
more ShF, KF, and CF responses. For the
remaining comparisons under standard and 
stress conditions, the null hypothesis was 
tested. 


Procedure 


Four groups, each consisting of six female 
students, were tested and retested within one 
to two weeks with counterbalanced Behn and 
Rorschach blots. Both tests were administered 
according to Klopfer’s directions (5), with 
the exception that the Ss, where necessary, 
were asked to repeat the inkblot series, giv- 
ing additional responses to each card, so that 
a minimum of two responses per card for 
Cards I through IX, and four responses for 
Card X, were obtained. This procedure was 
used to control R. 

Prior to the second test administration, one 
Rorschach group and one Behn group were 
exposed to experimental stress. The stress pro- 
cedure consisted of the identical typewritten 
personality interpretation, supposedly based 
on each S’s first inkblot administration, which 
advised that the S was poorly adjusted (4).
The stress employed was thus psychological
in nature. 

Following the first administration, the Ss 
were divided into matched pairs on the basis 
of their Rorschach psychograms, and their 
scores derived from the Manifest Anxiety 
Scale (8). The latter was included in order to 
preselect and equate the groups on a variable 
which may be relevant to stress. One member 
of each matched pair was then randomly as- 
signed either to the experimental or control 
group. Following the second test administra- 
tion, all protocols were coded and scored fol- 
lowing the methods described by Klopfer (5). 


Results 


The means and standard deviations in 
Table 1 were compared by a three factorial 
(Type III) “mixed” analysis of variance de- 
sign (6). All hypotheses were tested at the .05 
level of significance. Departure from normality 
was not corrected for, on the basis of the 
Norton study as discussed by Lindquist (6). 
Heterogeneity of variance was “corrected” by 
setting a higher apparent level of confidence, 
i.e., the F table was entered at the .025 level 
of confidence to evaluate a hypothesis at the 
.05 level (6). To account for possible “infla-
tion of probabilities” due to the number of F
tests being computed, the significance of the 
final results was estimated by Wilkinson’s 
tables (9) for the binomial expansion. 
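
The kind of quantity tabled for this purpose can be illustrated with a brief Python sketch. It assumes that the tests are independent and that each has the same probability of reaching significance by chance; both are simplifying assumptions, and the function name `prob_at_least` is hypothetical. The sketch shows the binomial calculation in general form and does not reproduce the cited tables.

```python
from math import comb


def prob_at_least(k, n, alpha=0.05):
    """Probability of k or more significant results among n independent
    tests when each has probability alpha of reaching significance by chance."""
    return sum(comb(n, i) * alpha ** i * (1 - alpha) ** (n - i)
               for i in range(k, n + 1))


# Chance probability of two or more results significant at the .05 level
# for the three numbers of comparisons that arise in the analysis.
p_two_of_eight = prob_at_least(2, 8)
p_two_of_sixteen = prob_at_least(2, 16)
p_two_of_thirty_two = prob_at_least(2, 32)
```

The probability rises quickly as the number of comparisons grows, which is why a pair of significant results is read more cautiously when it is drawn from a larger set of tests.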


Matching of Controls and Experimentals 


The matching of control and experimental 
psychograms was evaluated by determining 
the main and simple effects of the two treat-
ments. No significant differences were ob-
tained for 16 comparisons.


Table 1

Means, Standard Deviations, and Reliability Coefficients for 16 Preselected
Rorschach and Behn Variables


                  Behn           Rorschach        Eᵃ        Cᵇ
Variable        M      SD       M      SD         r         r

W              8.5    4.49     8.2    3.92       .32       .58*
F             10.2    3.29    10.7    3.74       .16       Ai
A             11.4    2.23    10.9    2.66       .24       .21
M              2.2    1.40     2.7    1.65       .31       AS**
FM             2.7    1.64     2.5    1.50      -.71*      .53*
m              0.6    0.75     0.7    0.79      -.54*      .72*
FSh            3.2    2.13     2.0    1.38      -.08       .14
ShF            0.7    0.85     0.7    0.79      -.01       .19
FK             1.3    1.14     0.2    0.50      -.04       .00
KF             0.3    0.55     0.2    0.59       .64*     -.08
Fc             0.9    1.13     0.8    1.04       .16       .07
FC’            1.0    1.29     1.0    1.15       .70*      .09
FC             1.7    1.27     0.9    1.01       .15      -.10
CF             0.8    1.45     1.6    1.61      -.14       .34
RT            16.8    8.61    19.2    9.53       .37       .33
Rej            0.7    0.94     0.4    0.70       .16       6A"


ᵃ E = Reliability coefficients for experimental Ss, Behn vs. Rorschach.
ᵇ C = Reliability coefficients for control Ss, Behn vs. Rorschach.


* Significant at the .05 level of confidence.
** Significant at the .10 level of confidence.


Comparison of Behn and Rorschach 


Two of eight predicted differences between 
the Behn and the Rorschach were obtained, 
the Behn eliciting significantly more FK and 
FC responses. Three trends were also ob- 
tained, the Rorschach eliciting more CF and 
higher RTs, with the Behn eliciting more FSh 
responses. In this analysis, the evaluation of 
main effects was supplemented by determin- 
ing simple effects where significant interac- 
tions were obtained (6). The occurrence of 
two differences significant at the .05 level 
from among eight comparisons exceeds chance 
expectation (9). The null hypothesis for 
eight additional variables was upheld as pre- 
dicted. These results are presented in Table 2. 


Effect of Stress 


The null hypothesis was accepted for 14 of 
16 preselected Rorschach variables, the Behn 
and Rorschach protocols not changing differ- 
entially as a consequence of experimental psy- 
chological stress. Stress did differentially af- 
fect RT and FC, the Rorschach eliciting 
shorter RTs and less FC under stress than 
the Behn. There was also a tendency for stress
to have a differential effect upon F, FM, and
KF. The results of the analysis of variance 
are presented in Table 2. 

The occurrence of two significant differ- 
ences from among 16 comparisons does not 
appear to be a chance result with reference to 
“inflated probabilities.” One of the obtained 
F scores was significant at the .01 level and 
the other was significant at the .05 level, 
yielding an estimated combined probability 
value at the .05 level of significance (9). 


Effect of Manifest Anxiety Level 


The possible differential sensitivity of the 
Behn and the Rorschach to Manifest Anxiety 
Level scores was evaluated by testing the 
simple effects of anxiety level by the analysis 
of variance. In this analysis, the Rorschach 
elicited significantly more M responses in low 
anxious (M = 3.4, SD = 1.46) than in high 
anxious Ss (M = 1.8, SD = 1.49), while the 
Behn elicited significantly more FM responses 
in low anxious (M = 3.5, SD = 1.5) than in 
high anxious Ss (M = 1.9, SD = 1.38). How- 
ever, these two differences were obtained from 
among 32 comparisons and may be a chance 
occurrence (9).


Table 2


Analysis of Variance for 16 Preselected
Rorschach and Behn Variables


                         Source
                  Bᵃ                 ABCᵃ
Variable        F        p         F        p

W              0.07               2.63
F              0.31               3.93     .10
A              0.40               0.04
M              1.91               0.90
FM             0.13               3.23     .10
m              0.13               0.48
FSh            4.81*    .10       1.99
ShF            0.00               1.03
FK            17.41     .01       1.63
KF             0.33               4.87*    .10
Fc             0.07               0.32
FC’            0.00               0.86
FC             7.51     .05       5.48     .05
CF             3.83     .10       0.70
RT             3.06     .10      55.40     .01
Rej            2.29               1.17

ᵃ A: First and second inkblot administration.
  B: Behn and Rorschach inkblot administration.
  C: Experimental (stress) and control (nonstress) groups.
* Variance heterogeneous, p shifted from .05 to .10.

Reliability Coefficients


The coefficients reported in Table 1 were 
derived from the analysis of variance (6), as 
computed by Eichler (2). The control and 
experimental groups were treated as separate 
one-dimensional designs so as to separate the 
variance contributed by the stress procedure. 
The final coefficients should be interpreted 
cautiously, as they are limited by an N of 12,
and by the fact that the experimental condi- 
tions included the use of matched homogene- 
ous groups with R kept constant, thereby re- 
stricting the range of variation. 


Discussion 

The Behn and Rorschach differed signifi- 
cantly on two variables from among eight pre- 
determined comparisons. The two tests did 
not differ on an additional eight variables. 
These findings indicate that the respective 
stimulus characteristics of the two inkblot 
tests are similar in most respects, and appear 
to be dissimilar in only a few. 

The differences in the two variables (FC
and FK) may be determined by the differ-
ences in the stimulus properties of the two
tests. It appears that the relatively less struc- 
tured stimulus properties of the Rorschach 
make FC and FK responses less likely to oc-
cur, in comparison to the more highly struc-
tured Behn.¹ Further support is given to this
hypothesis by a significantly greater increase in the
number of FC responses to the Behn than to
the Rorschach as a consequence of stress. As-
suming that there are certain changes in the
S due to the stress condition, the Behn, origi- 
nally holding out the greater possibility for 
FC responses, permitted the S in his presumed 
changed condition after stress to give still 
more FC responses. A similar evaluation may 
be made for the trend obtained with the KF 
variable. 

From the above discussion, one would ex- 
pect that the Rorschach, which tends to elicit 
higher RTs than the Behn under standard 
conditions, would also elicit higher RTs in
Ss under stress. The obtained results, how- 
ever, are the reverse of this expectation. An 
explanation of this finding is beyond the scope 
of the present investigation, but it demon- 
strates that the stimulus properties of ink- 
blots may interact with the experimental con- 
ditions in complex fashion. 

The discussed differences in the stimulus 
properties of the Rorschach and Behn are 
congruent with the moderate reliability co- 
efficients reported by Eichler (2) and ob- 
tained in the present study. It should be noted 
that most of the nonsignificant coefficients in 
this study occur with shading and color vari-
ables. While these coefficients must be inter-
preted with caution, they support the conclu-
sion that the stimulus properties of the two
tests may be dissimilar for shading and color. 
These findings, subject to further verification, 
limit the degree to which the two tests are 
congruent. 

It has also been suggested (5) that such 
coefficients may provide some information on 
the degree to which some Rorschach cate- 
gories are stable. From this point of view, it 
should be noted that the introduction of stress 
results in significant reversals in the coeffi-
cients for FM and m, a change from near zero
to significant coefficients of KF and FC’, and
a change from significant to nonsignificant co-
efficients for W, F, M, and Rej. It is there-
fore hypothesized that these seven variables
may be especially relevant for an evaluation
of the effect of stress on Rorschach perform-
ance. The results for KF and FC’ also sug-
gest that the unreliability of these categories
in the control group may be due to their ex-
treme sensitivity to situational stress. In addi-
tion, significantly high coefficients were ob-
tained in the control group for six variables,
even though the test-retest conditions may be
dissimilar in some respects. This finding sup-
ports the stability of these variables in the
evaluation of personality characteristics.


¹ This conclusion is also in agreement with the
trend for the Behn to elicit more FSh responses and
for the Rorschach to elicit more CF responses.


Summary 


The present investigation was designed to
examine the correspondence of the Behn
and the Rorschach inkblot series under standard
and under stress conditions for matched
homogeneous groups.

It was concluded that the Behn and Ror- 
schach are approximately equivalent for most 
response categories. It was also concluded that 
the obtained differences between the Ror- 
schach and Behn under standard and stress 
conditions may be attributed in part to dif- 
ferences in the stimulus properties of the two 
tests. 


Some evidence was presented concerning 
the reliability of some Rorschach response 
categories in evaluating personality charac- 
teristics. 


Received November 12, 1956.


Visual and Verbal Presentation of TAT Stimuli


Dell Lebo and Margaret Harrigan¹


Richmond Professional Institute 


A previous publication by Lebo (11) indi- 
cated that TAT cards with the most consistent 
negative stimulus value were generally those 
described in negative terms by Murray in the 
TAT manual (12). Such a finding suggested 
that the card descriptions, if simply read to 
subjects, might elicit responses comparable in 
many ways to those obtained from the pic- 
tures themselves. 

If verbal descriptions of the TAT plates are 
capable of evoking responses similar to those 
given to the standard visual TAT card stimuli, 
several advantages become apparent. In the 
first place, the verbal test could be adminis- 
tered to groups of subjects without the neces- 
sity for sharing the pictures, and without some 
of the other difficulties encountered in group 
administration of the TAT. Verbal descrip- 
tions might also be easier to translate into a 
foreign language than having the pictures re- 
drawn to conform to another cultural group. 
Lastly, a verbal form of the TAT might be 
valuable as a projective test for blind persons. 
At present, only a severely limited number of 
adaptations of projective techniques are avail- 
able for such people. 


Problem 


It was the aim of the present investigation 
to study the results of a visual (card) and 
verbal (descriptive) presentation of TAT 
stimuli. Such investigation rested on two main 
postulates: 1. The descriptions of the TAT 
pictures furnished in the manual (12) were 
adequate descriptions of the cards. 2. Mate- 
rial elicited by responding to TAT stimuli 
could be measured and compared with meaningful
results. There seemed to be no reason
why the principles of interpretation applied
to the TAT could not apply equally well to
the verbal material.


1 The authors wish to express their gratitude to
Drs. Donald P. Ogdon and Archer L. Michael for
their assistance in obtaining subjects for this
experiment.

The hypothesis for the present study was 
that verbal TAT stimuli would provide cues 
similar to those of the cards themselves; that 
the cards and their verbal descriptions would 
be sufficiently similar in their effect upon the
subject to evoke comparable responses. 


Method 
Procedure 


Thirty-two female students, between 18 and 20 
years of age, selected from introductory courses in 
psychology, were used as subjects. Previous research 
had shown that the responses of college students to 
the TAT did not differ significantly from those of 
the general population and that IQ was not a dif- 
ferentiating factor (3, 5). All the subjects were tested 
individually by the junior author. 

All 20 cards for adult female subjects were pre- 
sented in single individual settings. Garfield et al. 
(10) found no significant differences between the re- 
sults of tests given in one or two sessions. The cards 
were divided so that alternate numbers were pre- 
sented in the standard visual manner, and the others, 
in the form of verbal descriptions as presented by 
Murray (12), read aloud by the examiner. The descriptions
for cards 14 and 20 were altered slightly
from those of Murray. The phrase “man (or woman)” 
was changed to read “person.” The use of both words 
might have seemed confusing, and the use of either 
one would have made the descriptions less ambiguous 
than the pictures. Each description was repeated 
twice. Standard instructions for card presentation 
from Stein’s manual (14) were followed. For the 
presentation of the descriptions, suitable changes were 
made. Thus, “I am going to describe some scenes” 
was substituted for “I am going to show you some 
pictures.” 

The order and manner of presentation were rotated,
e.g., ABCD, BCDA, CDBA, and DCBA, to help
counterbalance practice and fatigue effects. To keep 
the test material as nearly as possible in its original 
order, the first ten cards were presented before the 
second ten. Lebo (11) had suggested that it might
be emotionally disquieting to present a mixture of
cards from both series.


Evaluation of Data 


The data were evaluated by means of: (a)
a word count; (b) an idea count (8); (c) a
rating scale for emotional tone, or mood, of 
both the theme of the story and the outcome 
(9); (d) a rating scale for level of response, 
or the degree to which subjects included fac- 
tors required by directions and responded with 
feeling (15); and (e) the dynamic content (1).
Since responses may be productive in quan- 
tity without being clinically significant, the 
amount of dynamic content should be a help- 
ful comparison of visual and verbal TAT 
stimuli. 

Normative data (5, 6, 9, 15) were also used 
as a basis for comparison with the data of the 
present investigation. 


Statistical Analysis 


Statistical treatment of projective material 
has long been a problem. Despite recent de- 
velopments and refinements, it still seems to 
be in an uncertain stage. This uncertainty is 
reflected in the present investigation in the 
inclusion of both parametric and nonpara- 
metric statistics. 


Results 


Results of an analysis of variance for the
different methods of evaluation are presented
in Table 1. When responses to verbal descriptions
and to the plates themselves were
compared, there was no significant difference
attributable to the method of presentation except
for level of response, which varied to a
degree significant at the .05 level. As the level
of response was slightly higher for the verbal
descriptions than for the pictures, this difference
suggested that responses to verbal descriptions
were not inferior in quality to standard
responses.


Table 1

Analysis of Variance for Methods of Evaluation

Method of             F for          F for      F for
Evaluation            Presentation   Cards      Interaction
Word count             .79            2.05**    1.46
Idea count             .43            2.10**    1.18
Story mood            2.93           20.49**    5 Rv
Outcome mood          2.10            4.35**    1.32
Level of response     3.68*           7.68**     .91
Dynamic content        .14             .21       .09

* Significant at the .05 level of confidence.
** Significant at the .01 level of confidence.


Table 2

Median Test for Significant Differences Between
Responses to Visual and Verbal Stimuli

Method of
Evaluation            χ²
Word count             .02
Idea count            0.0
Story mood            4.01*
Outcome mood           .12
Level of response     0.0
Dynamic content       1.69

* Significant at the .05 level of confidence.

The results of the median test are presented 
in Table 2. Again, there did not appear to be 
much significant variation between responses 
to visual and verbal presentation of the 
stimuli, except, this time, in emotional tone, 
which differed significantly at the .05 level. 
One explanation for this finding is that the 
emotional tone of the pictures may not be 
completely or accurately conveyed in every 
verbal description. 
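
The computation behind a median test of this kind can be sketched as follows. This is a minimal illustration only: the ratings are hypothetical, the uncorrected 2 x 2 chi-square is an assumption (the source does not say whether a continuity correction was applied), and the function name is invented for the example. With 1 degree of freedom, a chi-square above 3.84 is significant at the .05 level.

```python
from statistics import median


def median_test_chi2(group_a, group_b):
    """Chi-square (1 df, no continuity correction) for a two-sample median test."""
    pooled_median = median(list(group_a) + list(group_b))

    # Count scores above the pooled median in each group.
    a_above = sum(x > pooled_median for x in group_a)
    b_above = sum(x > pooled_median for x in group_b)
    a_below = len(group_a) - a_above
    b_below = len(group_b) - b_above

    n = len(group_a) + len(group_b)
    denom = (a_above + b_above) * (a_below + b_below) * len(group_a) * len(group_b)
    if denom == 0:
        return 0.0  # an empty margin means the table carries no information
    return n * (a_above * b_below - b_above * a_below) ** 2 / denom


# Hypothetical mood ratings, not data from the study.
visual = [3, 4, 2, 5, 3, 4, 4, 2]
verbal = [2, 3, 2, 3, 1, 3, 2, 4]
chi2 = median_test_chi2(visual, verbal)
print(round(chi2, 2), "significant at .05" if chi2 > 3.84 else "not significant at .05")
```

Scores tied with the pooled median are counted here as "not above"; other conventions exist and would change the cell counts.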

Product-moment and rank-order correlations
with normative data are shown in Table 3.
On the basis of that table it may be said that
verbal descriptions of the cards elicited the
emotional tone of the stories (9), the level
of response (15), and common themes (6)
in a manner similar to the pictorial stimuli
for the different cards.


Table 3

Correlations of Visual Stimuli with Normative Data
and with Verbal Stimuli

                      Product-Moment         Rank Order
Method of             Visual     Visual      Visual     Visual
Evaluation            and Norm   and Verbal  and Norm   and Verbal
Story mood            .818*      .795*       .830*      .708*
Level of response     .777*      .76A*       .694*      .695*
Common themes         .622* .469*

* Significant at .01 level of confidence.

The agreement between judges’ rating of 
ten per cent of the data, randomly selected, 
ranged from 64% to 100% for the various 
methods of data evaluation. This agreement 
for a small sample rating seems fairly high, 
since, as Dana (3) has pointed out, agree- 
ment of 50 to 60 per cent is usual. Correlation 
coefficients for judges’ ratings ranged from 
.653 to 1.0, all significant at the .01 level of 
confidence. 


Discussion 


The results in general substantiated the hy- 
pothesis that verbal descriptions of TAT pic- 
tures evoked responses comparable to those of 
the TAT itself. Considering that the descrip- 
tions were not designed to elicit stories, but 
to identify the cards, the present writers feel 
that these findings are relatively encouraging 
for the utilization of the verbal test in further
investigations of, if not as an approximation
of, the pictorial test.

It has been said that story telling is one of 
the oldest forms of the projective approach 
(13). Yet Bell has indicated that this method 
has “not progressed beyond the exploratory 
stages” (2, p. 71). Since “there are about as 
many ways of analyzing TAT stories as there 
are clinical psychologists who use the method” 
(7, p. 125), and many of the ways should be 
applicable to verbal TAT stimuli, perhaps the 
storytelling approach may leave the explora- 
tory stage and enter the experimental, via the 
path of card description. 

Even though an examination of the data de- 
rived from the visual or pictorial TAT and 
the verbal TAT stimuli revealed no consistent 
pattern favoring either procedure, it must be 
realized that these findings are limited. The 
circumstances of this study served to insure 
optimal conditions in that the subjects were 
youthful students with a background stressing 
reward through effort. Hence, they can be as- 
sumed to have cooperated to a maximum ex- 
tent. Only further investigation can safely in- 
dicate the degree to which the findings of the 
present study can be generalized. More specifically,
the verbal TAT would probably produce
very different protocols from the blind
than were obtained in this study. Other investigations
have shown that projective test
results with the blind often present seemingly 
abnormal evidence when compared to stand- 
ard norms (4). 

Much more work remains before verbal ad- 
ministration of the test may truly be useful in 
group administration, for translation into for- 
eign tongues, and for adaptation for blind or 
partially sighted individuals. Some of this 
work is now in progress and should be re- 
ported in the literature from time to time. 


Summary 


Previous work by Lebo had suggested that 
Murray’s verbal descriptions of the TAT 
cards were, in some respects, similar to the 
cards themselves. The present experiment 
compared the responses of 32 female college 
students to TAT pictorial and verbal stimuli. 

It was found that the substitution of verbal 
description for visual plates was apparently 
justified, insofar as the present subjects were 
concerned. For when the two methods of 
presentation were compared objectively on 
several bases it was found that one method 
did not appear consistently superior to the 
other. Indeed, despite the fact that card de- 
scriptions were not devised to replace the 
cards, responses to the verbal descriptions 
were more like than unlike responses to the 
cards, according to the measures employed. 


Received October 29, 1956.


An Inventory for Assessing Different
Kinds of Hostility¹


Arnold H. Buss and Ann Durkee 
Carter Memorial Hospital, Indianapolis 


In their everyday functioning, clinical psy- 
chologists are alert to the ways in which hos- 
tility is expressed, and they are usually care- 
ful to distinguish various modes of expression. 
When the aggression is overt and direct, a 
distinction is usually made between verbal 
hostility and physical assault. Overt mani- 
festations are clearly separated from covert 
manifestations of hostility, e.g., cursing and 
threatening behavior vs. gossiping and round- 
about derogation. 

Since meaningful distinctions can be made 
between subclasses of hostility, a global 
evaluation of hostility would seem to contain 
considerable ambiguity. The statement “He is 
hostile” would apply equally well to a man 
who beats his wife and to a man who is spite- 
fully late for appointments. Thus, it should 
be expected that attempts to assess hostility 
would include not only a global estimate of 
intensity but also estimates of the intensity 
of the various subhostilities. 

The writers know of no published hostility 
inventory that attempts more than a global 
estimate of hostility. Three of the more re- 
cently developed inventories, those of Cook 
and Medley (2), Moldawsky (1), and Siegel 
(9), all consist of items selected from the 
MMPI by clinical psychologists. None of 
these investigators attempted to group items 
into subscales representing various aspects of 
hostility. Thus a nonsuspicious, assaultive in- 
dividual might receive the same score as a 
nonassaultive, suspicious individual. A score 
on one of these inventories would appear to
be as ambiguous as the statement “He is hostile.”
What is clearly needed is an inventory
that attempts to assess the various aspects of
hostility. This paper describes the development
of such an inventory.


1 The writers wish to acknowledge the considerable
efforts of Dr. Herbert Gerjuoy in obtaining subjects
and in facilitating statistical analyses.


Construction of the Inventory 


Varieties of Hostilities 


The first task was to define the subclasses 
of hostility that are typically delineated in 
everyday clinical situations. Such a classifica- 
tion was made in an earlier study (1), and the 
present classification is an elaboration of the 
previous one. 


Assault—physical violence against others. This in- 
cludes getting into fights with others but not de- 
stroying objects. 

Indirect Hostility—both roundabout and undi- 
rected aggression. Roundabout behavior like mali- 
cious gossip or practical jokes is indirect in the sense 
that the hated person is not attacked directly but 
by devious means. Undirected aggression, such as 
temper tantrums and slamming doors, consists of a 
discharge of negative affect against no one in par- 
ticular; it is a diffuse rage reaction that has no di- 
rection. 

Irritability—a readiness to explode with negative 
affect at the slightest provocation. This includes quick 
temper, grouchiness, exasperation, and rudeness.

Negativism—oppositional behavior, usually directed 
against authority. This involves a refusal to cooperate 
that may vary from passive noncompliance to open 
rebellion against rules or conventions. 

Resentment—jealousy and hatred of others. This 
refers to a feeling of anger at the world over real or 
fantasied mistreatment. 

Suspicion—projection of hostility onto others. This 
varies from merely being distrustful and wary of 
people to beliefs that others are being derogatory or 
are planning harm. 

Verbal Hostility—negative affect expressed in both 
the style and content of speech. Style includes argu- 
ing, shouting, and screaming; content includes 
threats, curses, and being overcritical.


Item-writing Techniques


The writers constructed a pool of items and 
supplemented this pool with items borrowed 
from previous inventories. Most of the bor- 
rowed items underwent modification, and the 
following principles served as guides in writ- 
ing and selecting items. 


1. The item should refer to only one subclass of 
hostility, since an item that overlaps several cate- 
gories would not help in distinguishing patterns of 
hostility. 

2. The behaviors and attitudes involved should be 
specific, and the stimulus situations that arouse them 
should be near universal, e.g., “It makes my blood
boil to have people make fun of me.” “Makes my 
blood boil” is a fairly specific response, and being 
ridiculed is a common situation for most people. 

3. The item should be worded so as to minimize 
defensiveness in responding. It has been established 
that social desirability accounts for much of the 
variance of normals’ responses to inventories (4, 5). 
In attempting to facilitate respondents’ admitting to 
socially undesirable behaviors, three item-writing 
techniques were employed: 

First, assume that the socially undesirable state al- 
ready exists and ask how it is expressed, e.g., “When 
I really lose my temper I am capable of slapping 
someone,” “When I get mad, I say nasty things.” In 
these items the loss of temper is assumed, and the 
subject is asked only whether he expresses it physi- 
cally. This procedure emphasizes a report of behavior 
and tends to minimize the value judgments associated 
with hostility. 

Second, provide justification for the occurrence of 
hostile behavior, e.g., “Whoever insults me or my 
family is asking for a fight,” “People who continually 
pester you are asking for a punch in the nose,” “Like 
most sensitive people, I am easily annoyed by the 
bad manners of others.” When the item provides a 
rationale for the aggression, the subject’s defensive 
and guilt reactions are reduced, and he does not 
necessarily answer in the direction of social desir- 
ability. 

Third, use idioms, e.g., “If somebody hits me first, 
I let him have it,” “When I am mad at someone, I 
will give him the silent treatment.” Idioms have a 
high frequency of usage in everyday life, and these 
phrases are typically used by subjects to describe 
their own behavior and feelings to others. It is an- 
ticipated that these phrases will merely echo what 
the subject has previously verbalized, and therefore 
when such phrases apply, they will be readily ac- 
cepted and admitted. 

4. Take into account the effects of response set 
by including both true and false items. If all the 
items were scored in the direction of hostility only 
when marked “True,” a subject could get a low score 
simply by answering all the items “False.” Ideally, 
this kind of response set is best controlled when the 
number of true items equals the number of false
items. However, such equality was not feasible because
of the difficulty of constructing false items that
met the other criteria. Therefore, a compromise ratio
of three true items to one false one was adopted.
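
The response-set problem described in point 4 can be illustrated with a small scoring sketch. The key below follows the True/False marking of the ten Assault items in Table 1 (items 2 and 6 are keyed False); the function name and the two extreme respondents are hypothetical and serve only to show how keying both directions limits what a uniform answering style can achieve.

```python
# Keyed (hostile) direction for the ten Assault items as marked in Table 1:
# True for ordinary items, False for items marked "F" (items 2 and 6).
ASSAULT_KEY = [True, False, True, True, True, False, True, True, True, True]


def scale_score(responses, key):
    """Count the answers that match the keyed, hostile direction."""
    return sum(1 for answer, keyed in zip(responses, key) if answer == keyed)


# Two hypothetical respondents who answer every item the same way.
all_false = [False] * len(ASSAULT_KEY)
all_true = [True] * len(ASSAULT_KEY)

print(scale_score(all_false, ASSAULT_KEY))  # earns points only on the False-keyed items
print(scale_score(all_true, ASSAULT_KEY))   # earns points only on the True-keyed items

# If every item were keyed True, an all-"False" respondent would score zero.
print(scale_score(all_false, [True] * len(ASSAULT_KEY)))
```

With an unequal ratio of True to False items the control is only partial: a respondent answering "True" throughout still obtains a high score, which is the cost of the compromise described above.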


On the basis of the foregoing considerations, 
a pool of hostility items was compiled. Next 
it was decided to add the variable of guilt be- 
cause the relationship of guilt to the various 
subhostilities is of clinical interest. Accord- 
ingly, items were compiled for a Guilt scale, 
with guilt being defined as feelings of being 
bad, having done wrong, or suffering pangs of 
conscience. 


Item Analyses 


The first version of the inventory consisted 
of 105 items, with items from each scale ran- 
domly scattered throughout the inventory. It 
was administered in group fashion to 85 male 
and 74 female college students. In an attempt 
to reduce defensiveness, all protocols were 
anonymous. The various hostility scales and 
the Guilt scale were scored, and separate item 
analyses were performed for men and women. 

Two criteria were used in item selection: 
frequency and internal consistency. Frequency 
refers to the occurrence of the particular be- 
havior in the population, as measured by the 
proportion of the sample answering in the di- 
rection of hostility (or guilt). If a given be- 
havior is near-universal in the population or 
virtually absent, it obviously does not distin- 
guish between individuals. A criterion of fre- 
quency is necessary to eliminate items that 
are answered in one direction by virtually 
everyone, and it was decided to accept only 
items answered in one direction by 15-85% 
of the sample. 

Internal consistency was measured by the 
correlation of an item with the score of the 
scale in which it belonged. Since the items are 
scored dichotomously, the biserial correlation 
coefficient was used. The criterion for item 
selection was a correlation of at least .40 for 
both the male and female samples. 
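
The two selection criteria can be sketched in code as follows. This is an illustration under stated assumptions, not the authors' computation: the biserial coefficient uses the common textbook formula with the population standard deviation, the data are hypothetical, and the function names are invented for the example. In the study an item had to reach .40 in both the male and the female sample, so the check would be run once per sample.

```python
from statistics import NormalDist, mean, pstdev


def biserial(item, totals):
    """Biserial correlation between a 0/1 item and the scale total scores."""
    ones = [t for i, t in zip(item, totals) if i == 1]
    zeros = [t for i, t in zip(item, totals) if i == 0]
    p = len(ones) / len(item)  # proportion answering in the keyed direction
    q = 1.0 - p
    # Ordinate of the standard normal curve at the point dividing p from q.
    y = NormalDist().pdf(NormalDist().inv_cdf(p))
    return (mean(ones) - mean(zeros)) / pstdev(totals) * (p * q / y)


def keep_item(item, totals, min_r=0.40, low=0.15, high=0.85):
    """Apply the frequency criterion first, then the internal consistency criterion."""
    p = sum(item) / len(item)
    if not (low <= p <= high):
        return False
    return biserial(item, totals) >= min_r


# Hypothetical data: eight respondents, one item, and their scale totals.
item = [1, 0, 1, 1, 0, 0, 1, 0]
totals = [8, 7, 6, 3, 2, 4, 5, 1]
print(round(biserial(item, totals), 3), keep_item(item, totals))

# An item endorsed by one respondent in eight fails the 15-85% criterion.
rare_item = [1, 0, 0, 0, 0, 0, 0, 0]
print(keep_item(rare_item, totals))
```

Checking frequency first also keeps the biserial formula away from items answered one way by everyone, for which the coefficient is undefined.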

Only 60 of the original 105 items met the 
frequency and internal consistency criteria. 
The number of items in several of the scales 
was so low that unreliability (lack of test- 
retest stability) seemed inevitable. Therefore, 
additional new items were written and old 
ones modified. Most of the modifications were
attempts to alter the frequency measure, i.e.,
decrease the popularity of items universally 
endorsed and increase the popularity of items 
rarely endorsed. 

The revised inventory contained 94 items. 
It was administered in group fashion to 62 
male and 58 female college students, and 
separate item analyses were performed for 
each sex. Again the minimum item-scale cor- 
relation was set at .40, but this time the 
frequency criterion was modified. The first 
item analysis had revealed sex differences in 
the proportion of the sample answering in the 
direction of hostility (or guilt). For several 
items the proportion of male students was 
over 15%, but the proportion of female stu- 
dents was under 15%. Since the 15-85% fre- 
quency criterion might eliminate items that 
differentiated between men and women, a less 
stringent criterion was adopted: 15-85% for 
either men or women. In addition, an attempt 
was made to insure that each scale contained 
items whose frequencies varied over a wide 
range. 

The second item analysis yielded 75 items, 
66 for hostility and 9 for guilt. It was found 
that more False items were discarded than 
True items, and the final form of the inven- 
tory contains 60 True items and 15 False 
items, a ratio of four to one. The items com- 
prising the final form of the inventory are 
listed in Table 1. Each item is grouped with 
the other items in its scale, and the False 
items are marked “F.” 


Social Desirability 


Responses to inventory items are at least in 
part determined by the respondent’s desire to 
place himself in a favorable light. This tend- 
ency assumes great importance in a hostility 
inventory, which deals with behaviors that are 
generally regarded as socially unacceptable. 
The potency of the tendency to give socially 
desirable answers has been demonstrated by 
Edwards (4). He had college students assign 
each of 140 personality trait items to one 
of nine intervals of social desirability. Scale 
values for social desirability were obtained by 
the method of successive intervals. Then the 
140 items were administered to different college
students, with standard inventory instructions.
The correlation between social desirability
and probability of endorsing the items
was .87. Subsequent studies with other inven- 
tories have confirmed the fact that social de- 
sirability is an important uncontrolled vari- 
able in many present-day inventories (5, 8). 

In constructing the present inventory, an 
attempt was made to minimize the variable of 
social desirability. In order to test the success 
of this attempt, Edwards’ procedure (4) was 
followed. The 66 hostility items of the final 
inventory were scaled for social desirability, 
using the method of successive intervals. The 
judges were 85 male and 35 female college 
students. The men’s and women’s judgments 
were quite similar, and they were pooled. 
Next, the inventories of 62 men and 58 women 
(who had previously taken the inventory and 
were different from the judges) were used to 
determine the probability of endorsement for 
each of the 66 hostility items. The product- 
moment r’s were .27 for the men and .30 for 
the women. Both correlations are significantly 
above zero at the .05 level of confidence, 
which suggests that social desirability has a
small but significant effect on the direction of
responding.
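
The significance of correlations of this size can be checked with the usual t test for a product-moment r. The sketch below assumes that the 66 hostility items are the observations (64 degrees of freedom) and that a two-tailed test was intended; the source states neither point, and the function name is illustrative.

```python
import math


def t_for_r(r, n):
    """t statistic for testing a product-moment r against zero, with n - 2 df."""
    return r * math.sqrt(n - 2) / math.sqrt(1.0 - r * r)


N_ITEMS = 66  # hostility items scaled for social desirability

for label, r in (("men", 0.27), ("women", 0.30)):
    print(label, round(t_for_r(r, N_ITEMS), 2))
```

Both values exceed the two-tailed .05 critical value for 64 degrees of freedom, which is approximately 2.00, in line with the statement that the correlations are significantly above zero.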

However, these two correlations are consid- 
erably lower than the correlation of .87 re- 
ported by Edwards (4). In accounting for this 
discrepancy two differences between his study 
and the present one should be noted. First, 
the present items were designed to measure 
only the hostile components of personality, 
whereas Edwards’ inventory taps a variety of 
personality components. Since hostile acts are 
generally regarded as being socially undesirable,
the upper end of the social desirability
continuum is not represented in the present 
inventory. The present inventory ranged from 
extremely undesirable to moderately desirable 
behaviors; the inventory used by Edwards in- 
cludes not only extremely undesirable but also 
extremely desirable behaviors. 

The restriction of range can be clearly seen 
when social desirability scale values of the 
two inventories are compared. The scaling 
procedures were identical, but the present 
range of scale values was .23 to 2.38, while 
Edwards’ range was .50 to 4.70.2 A cur- 


2 Edwards’ scale values for social desirability and 
his probability of endorsement values were estimated 
from his Fig. 2 (4). 
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Table 1 
Items Comprising the Hostility-Guilt Inventory * 
(F = False items) 








Assault: 


1. Once ina while I cannot control my urge to harm 
others. (9) 
2F. I can think of no good reason for ever hitting 
anyone. (17) 
3. If somebody hits me first, I let him have it. (25) 
4. Whoever insults me or my family is asking for a 
fight. (33) 
5. People who continually pester you are asking for 
a punch in the nose. (41) 
6F.I seldom strike back, even if someone hits me 
first. (1) 
7. When I really lose my temper, I am capable of 
slapping someone. (49) 
8. I get into fights about as often as the next person. 
(57) 
9. If I have to resort to physical violence to defend 
my rights, I will. (65) 
10. I have known people who pushed me so far that 
we came to blows. (70) 


Indirect: 


1. I sometimes spread gossip about people I don’t 
like. (2) 

2F. I never get mad enough to throw things. (10) 

3. When Iam mad, I sometimes slam doors. (26) 

4F, I never play practical jokes. (34) 

5. When I am angry, I sometimes sulk. (18) 

6. I sometimes pout when I don’t get my own way. 
(42) 

7F. Since the age of ten, I have never had a temper 
tantrum. (50) 

8. I can remember being so angry that I picked up 
the nearest thing and broke it. (58) 

9. I sometimes show my anger by banging on the 
table. (75) 


Irritability : 


1. Ilose my temper easily but get over it quickly. (4) 
2F. I am always patient with others. (27) 
3. Iam irritated a great deal more than people are 
aware of. (20) 
4. It makes my blood boil to have somebody make 
fun of me. (35) 
5F. If someone doesn’t treat me right, I don’t let it 
annoy me. (66) 
Sometimes people bother me just by being around. 
(12) 
I often feel like a powder keg ready to explode. (44) 
I sometimes carry a chip on my shoulder. (52) 
I can't help being a little rude to people I don’t 
like. (60) 
10F. I don't let a lot of unimportant things irritate 
me. (71) 
11. Lately, I have been kind of grouchy. (73) 


o 


a all tad 


Negativism : 
1. Unless somebody asks me in a nice way, I won't 
do what they want. (3) 
2. When someone makes a rule I don’t like I am 
tempted to break it. (12) 


3. When someone is bossy, I do the opposite of what 
he asks. (19) 





4. When people are bossy, I take my time just to 
show them. (36) 

5. Occasionally when I am mad at someone I will 
give him the “‘silent treatment."’ (28) 


Resentment : 


— 


I don't seem to get what’s coming to me. (5) 

Other people always seem to get the breaks. (13) 

3. When I look back on what's happened to me, I 
can’t help feeling mildly resentful. (29) 

4. Almost every week I see someone I dislike. (37) 

5. Although I don't show it, I am sometimes eaten 
up with jealousy. (45) 

6F. I don’t know any people that I downright hate. 
(21) 

7. If I let people see the way I feel, I’d be consid- 
ered a hard person to get along with. (53) 

8. At times I feel I get a raw deal out of life. (61) 


i) 


Suspicion : 


1. I know that people tend to talk about me behind 
my back. (6) 

2. I tend to be on my guard with people who are 
somewhat more friendly than I expected. (14) 

3. There area number of people who seem to dislike 
me very much. (22) 

4. There are a number of people who seem to be 
jealous of me. (30) 

5. I sometimes have the feeling that others are 
laughing at me. (38) 

6. My motto is ‘ Never trust strangers.”’ (46) 

7. I commonly wonder what hidden reason another 
person may have for doing something nice for 
me. (54) 

8. I used to think that most people told the truth 
but now I know otherwise. (62) 

9F. I have no enemies who really wish to harm me. (67) 

10F. I seldom feel that people are trying to anger or 
insult me. (72) 


Verbal: 


1. When I disapprove of my friends’ behavior, I let 
them know it. (7) 
2. I often find myself disagreeing with people. (15) 
3. I can't help getting into arguments when people 
disagree with me. (23) 
4. I demand that people respect my rights. (31) 
SF. Even when my anger is aroused, I don't use 
“strong language.”’ (39) 
6. If somebody annoys me, I am apt to tell him 
what I think of him. (43) 
7. When people yell at me, Iyell back. (47) 
8. When I get mad, I say nasty things. (51) 
9F. I could not put someone in his place, even if he 
needed it. (55) 
10. I often make threats I don’t really mean to carry 
out. (59) 
11. When arguing, I tend to raise my voice. (68) 
12F. I generally cover up my poor opinion of others. 
(63) 
13F. I would rather concede a point than get into an 
argument about it. (74) 





* The numbers in parentheses indicate the sequence of items in the mimeographed form of the inventory. 
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Table 1—Continued 





1. The few times I have cheated, I have suffered 
unbearable feelings of remorse. (8) 

2. I sometimes have bad thoughts which make me 
feel ashamed of myself. (16) 

3. People who shirk on the job must feel very guilty. 
(24) 

4. It depresses me that I did not do more for my 
parents. (32) 

5. Iam concerned about being forgiven for my sins. 
(40) 

6. I do many things that make me feel remorseful 
afterward. (48) 

7. Failure gives me a feeling of remorse. (56) 

8. When I do wrong, my conscience punishes me 
severely. (64) 

9. I often feel that I have not lived the right kind 
of life. (69) 





tailed distribution decreases the magnitude of 
a correlation coefficient, but it is possible to 
adjust for a difference in standard deviations 
(7, pp. 149-150). When Edwards’ correlation 
of .87 between social desirability and prob- 
ability of endorsement is adjusted to the pres- 
ent range of values, it becomes .74. There is 
still a large disparity between Edwards’ cor- 
rected correlation of .74 and the present ones 
of .27 and .30, and the curtailment of the 
range of social desirability evidently accounts 
for only a small part of the discrepancy. 
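
The adjustment for a difference in standard deviations can be sketched as follows. This is a minimal illustration using the standard range-restriction formula; it is not necessarily the exact procedure of reference (7), and the ratio of standard deviations used below (0.62) is a hypothetical value chosen only because it yields a result near the reported .74. The function name is likewise an assumption.

```python
import math

def adjust_for_range(r, sd_ratio):
    """Re-express a correlation r for a group whose standard deviation
    on the predictor is sd_ratio times that of the original group.

    sd_ratio < 1 models a curtailed range; sd_ratio > 1 models a wider one.
    """
    numerator = r * sd_ratio
    denominator = math.sqrt(1.0 - r * r + r * r * sd_ratio * sd_ratio)
    return numerator / denominator

# Hypothetical illustration: Edwards' .87 re-expressed for a range of
# social desirability values about 0.62 times as wide (assumed ratio).
adjusted = adjust_for_range(0.87, 0.62)
print(round(adjusted, 2))
```

With a ratio of 1.0 the function returns the correlation unchanged, which is a quick check that the formula is entered correctly.
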
The second difference between the studies 
lies in the construction of the present inven- 
tory. The writers were aware that social de- 
sirability might influence inventory responses, 
and attempted to minimize its effect by: (a) 
assuming that anger was present and inquiring 
only how it is expressed; (b) providing 
justification for admitting aggressive acts; 
and (c) including cliches and idioms that 


would find ready acceptance. On the other 
hand, Edwards used a list of unelaborated 
personality trait names, and there was no at- 
tempt to manipulate the wording of the items. 
Thus, the present low correlations between 
social desirability and probability of endorse- 
ment would seem to reflect the success of the 
item construction techniques used in the pres- 
ent study. 

Previous attempts at controlling social de- 
sirability have taken two forms. The first is 
to develop suppressor variables like the “va- 
lidity” scales of the MMPI (6). The second 
approach is to scale items for social desir- 
ability and then use a paired comparisons 
type of inventory, in which each item is paired 
with another item of matched social desir- 
ability (3). The present study suggests a third 
approach, that of focusing on the process of 
item construction. Perhaps the influence of 
social desirability can be substantially reduced 
or eliminated at the source, i.e., in the actual 
wording of the item. 


Factor Analyses 


The final form of the inventory was ad- 
ministered in group fashion to 85 male and 
88 female college students. The eight scales 
were scored, and product-moment correlations 
were computed for men and women sepa- 
rately. The correlation matrices are presented 
in Tables 2 and 3. None of the women’s cor- 
relations, and only two of the men’s correla- 
tions, are above .50, which suggests that the 
various scales are tapping at least partially 
independent behaviors. Thurstone’s centroid 
method (10) was used to extract two factors 
from each intercorrelation matrix. The axes 


Table 2 
Table of Intercorrelations for Men (N = 85) 

                              Indirect                                                       Verbal 
Variable            Assault   Hostility   Irritability   Negativism   Resentment   Suspicion   Hostility 
Indirect Hostility    .28 
Irritability          .32     [illeg.] 
Negativism            .30       .27          .20 
Resentment            .16       .33          .44            .31 
Suspicion             .11       .27          .26            .38          .58 
Verbal Hostility      .40       .40          .66            .25          .37          .21 
Guilt                -.03       .28          .24            .08          .27          .25         .16 
Table 3 
Table of Intercorrelations for Women (N = 88) 

                              Indirect                                                       Verbal 
Variable            Assault   Hostility   Irritability   Negativism   Resentment   Suspicion   Hostility 
Indirect Hostility    .38 
Irritability          .30       .31 
Negativism            .27       .34          .29 
Resentment            .14       .23          .30            .23 
Suspicion           [illeg.]    .19          .30            .15        [illeg.] 
Verbal Hostility      .37       .19          .44            .30          .22          .21 
Guilt                -.07       .05          .16            .01          .33          .27         .10 





for men and women were rotated to the same 
simple structure so that the factor loadings of 
the two sexes would be comparable. These 
factor loadings are presented in Table 4. 
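
For readers unfamiliar with centroid extraction, the following is a simplified sketch of the general technique, not a reproduction of the authors' computations. The communality estimate (largest absolute off-diagonal value in each column), the sign-reflection rule, the function name, and the small example matrix are all assumptions made for illustration; no rotation step is included.

```python
import math

def centroid_factors(corr, n_factors=2):
    """Extract factor loadings from a correlation matrix (list of lists)
    by a simplified centroid procedure."""
    n = len(corr)
    R = [row[:] for row in corr]
    loadings = [[0.0] * n_factors for _ in range(n)]
    for k in range(n_factors):
        # Communality estimate: largest absolute off-diagonal value per column.
        for j in range(n):
            R[j][j] = max(abs(R[i][j]) for i in range(n) if i != j)
        # Reflect variables until no off-diagonal column sum is negative.
        signs = [1.0] * n
        while True:
            col = [sum(signs[i] * signs[j] * R[i][j] for i in range(n) if i != j)
                   for j in range(n)]
            j = min(range(n), key=lambda idx: col[idx])
            if col[j] >= -1e-12:
                break
            signs[j] = -signs[j]
        S = [[signs[i] * signs[j] * R[i][j] for j in range(n)] for i in range(n)]
        total = sum(sum(row) for row in S)
        if total <= 0:
            break  # nothing left to extract
        root = math.sqrt(total)
        # Loading = column sum / sqrt(grand sum), with the original sign restored.
        a = [signs[j] * sum(S[i][j] for i in range(n)) / root for j in range(n)]
        for i in range(n):
            loadings[i][k] = a[i]
        # Residual matrix for the next factor.
        R = [[R[i][j] - a[i] * a[j] for j in range(n)] for i in range(n)]
    return loadings

# Hypothetical 4-variable matrix with two clusters (not data from this study).
example = [
    [1.0, 0.6, 0.2, 0.1],
    [0.6, 1.0, 0.1, 0.2],
    [0.2, 0.1, 1.0, 0.5],
    [0.1, 0.2, 0.5, 1.0],
]
for row in centroid_factors(example):
    print([round(value, 2) for value in row])
```

In this toy example the first factor is general (all loadings positive) and the second separates the two clusters by sign, which is the pattern that rotation to simple structure then sharpens.
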

If only factor loadings of .40 and over are 
considered meaningful, the first factor is de- 
fined by Resentment and Suspicion for men, 
and by Resentment, Suspicion, and Guilt for 
women. The second factor is defined by Assault, 
Indirect Hostility, Irritability, and Verbal 
Hostility for both sexes, with the addition 
of Negativism for women. However, both 
Guilt and Negativism had positive loadings 
on their respective factors for the men, also, 
and the sex differences just noted are slight. 
In fact, the men’s and women’s factor load- 
ings are generally similar, differences being 
small and random. Since the same axes were 
used for men and women, this similarity of 
factor loadings suggests that the factor struc- 
ture is stable. 

The two factors extracted from the inter- 
correlation matrix divide hostility into an 


Table 4 
Rotated Factor Loadings for Men and Women 

                               Men                          Women 
Variable               I        II       h²          I        II       h² 
Assault             [illeg.] [illeg.] [illeg.]      .19      .41    [illeg.] 
Indirect Hostility    .19      .40      .37         .00      .48      .38 
Irritability        [illeg.] [illeg.] [illeg.]      .14      .47      .44 
Negativism          [illeg.] [illeg.] [illeg.]   [illeg.] [illeg.] [illeg.] 
Resentment          [illeg.] [illeg.] [illeg.]      .57      .04      .45 
Suspicion             .66     -.02      .60         .54      .02      .45 
Verbal Hostility      .05      .63      .64         .04      .49      .44 
Guilt                 .29      .03      .14         .50      .28      .33 








“emotional” or attitudinal component (‘‘Peo- 
ple are no damn good”) and a “motor” com- 
ponent that involves various aggressive behav- 
iors. However, it should be noted that the 
factor loadings are not high. The average 
communality of the eight variables was .43 
for men and .40 for women, leaving consider- 
ably more than half of the test variance un- 
explained. Some of this specific variance may 
be attributed to unreliability of the scales 
(especially since they are short), but there 
seems to be much variance that is stable and 
unique. 

The presence of unique variance is not sur- 
prising, since it seems likely that there are 
more than two components of hostility. For 
example, the second factor includes both As- 
sault and Verbal Hostility, yet there are ob- 
viously many verbally hostile individuals who 
are not assaultive. Similarly, with respect to 
the first factor, resentment may be seen in 
the absence of distrust and suspicion. The 
presence of unique variance would seem to 
reflect the presence of these patterns within 
each factor. 

The population used in deriving the two 
factors was normal, but the factors appear to 
have relevance for clinical populations. For 
example, the characteristics associated with 
paranoid personalities suggest that such indi- 
viduals would score high on Resentment and 
Suspicion (Factor I) and low on the other 
scales. On the other hand, hysterical person- 
alities should score low on Resentment and 
Suspicion and high on Irritability, Negativ- 
ism, and Verbal Hostility. In both instances, 
no prediction can be made concerning Assault, 
since this variable is thought to be related to 
Table 5 
Means and Standard Deviations for College Men and Women 

                            Men            No.           Women 
Variable               Mean      SD       Items      Mean      SD 
Assault                5.07     2.48       10        3.27     2.31 
Indirect Hostility     4.47     2.23        9        5.17     1.96 
Irritability           5.94     2.65       11        6.14     2.78 
Negativism             2.19     1.34        5        2.30     1.20 
Resentment             2.26     1.89        8        1.78     1.62 
Suspicion              3.33     2.07       10        2.26     1.81 
Verbal Hostility       7.61     2.74       13        6.82     2.59 
Guilt                  5.34     1.88        9        4.41     2.31 
Total Hostility       30.87    10.24       66       27.74     8.75 



the variables of sex, socioeconomic status, psy- 
chopathology, etc. 


Norms 


The collection of normative data for a new 
instrument is a long-time endeavor. In the 
present instance the process has just begun. 
Norms are being collected for clinical popula- 
tions, and the construct validity of the inven- 
tory is being investigated. At present, the only 
norms available are for the 85 college men 
and 88 college women who were administered 
the final form of the inventory. The means 
and standard deviations of these two groups 
are presented in Table 5. Since these samples 
are small and not representative, the norms 
must be regarded as highly tentative. 
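
As an illustration of how such tentative norms might be applied, the sketch below converts a raw scale score to a standard (z) score using two rows of Table 5. The function and dictionary names are hypothetical, and nothing in the paper prescribes this particular use of the table.

```python
# Means and standard deviations copied from Table 5 (two scales only).
NORMS = {
    "men": {"Assault": (5.07, 2.48), "Verbal Hostility": (7.61, 2.74)},
    "women": {"Assault": (3.27, 2.31), "Verbal Hostility": (6.82, 2.59)},
}

def standard_score(raw, sex, scale):
    """Express a raw score as a deviation from the group mean in SD units."""
    mean, sd = NORMS[sex][scale]
    return (raw - mean) / sd

# A raw Assault score of 8 is compared with each college sample.
print(round(standard_score(8, "men", "Assault"), 2))
print(round(standard_score(8, "women", "Assault"), 2))
```

The same raw score lies farther above the women's mean than the men's, which is why separate norms are reported for the two groups.
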


Summary 


This paper described the construction of an 
inventory consisting of the following scales: 
Assault, Indirect Hostility, Irritability, Nega- 
tivism, Resentment, Suspicion, Verbal Hos- 
tility, and Guilt. The first and second versions 
of the scale were item analyzed, and the final 
revision consists of 75 items. 


The hostility items were scaled for social 
desirability, and social desirability was cor- 
related with probability of endorsement. The 
r’s of .27 and .30 for college men and women, 
respectively, were considerably smaller than 
those of previous studies. The reduction in the 
effects of social desirability was attributed to 
item-writing techniques. 

Factor analyses of college men’s and wom- 
en’s inventories revealed two factors: an atti- 
tudinal component of hostility (Resentment 
and Suspicion) and a “motor” component 
(Assault, Indirect Hostility, Irritability, and 
Verbal Hostility). The relevance of these fac- 
tors to the study of abnormal as well as nor- 
mal personalities was illustrated. 
The Deliberate Use of a Set to “Fake” in 
Personality Questionnaires¹ 


Marshall B. Jones 
U. S. Naval School of Aviation Medicine, Pensacola, Florida² 


Test-taking attitudes pose a central prob- 
lem in the measurement of personality. The 
forced-choice technique and special scales, 
like the K Scale of the MMPI, represent two 
familiar approaches to this problem. This 
note is concerned with still another approach 
which, though obvious enough, has received 
little attention. 

Some scales correlate very well with them- 
selves when administered under a set to 
“fake.” Specifically, the subjects are asked to 
respond to each item not only as they think 
they are but as they think they should be. 
The test key is then applied to both sets of 
responses. The correlation between the two re- 
sultant scores, self-descriptive (s-d) and ideal- 
descriptive (i-d), will often run as high as the 
mid-.50’s. Suppose, however, that a pool of 
i-d items were analyzed against an s-d scale 
which was already well correlated with its 
i-d parallel. Conceivably an i-d predictor scale 
could be developed which would correlate well 
enough with the s-d scale to serve as a second 
form. Were this possible, the problem of test- 
taking attitudes, at least for the scale at issue, 
would be solved; we could simply use the i-d 
scale. 

To test this possibility, the 197-item Pensa- 
cola Z Survey (1) was administered to two 
samples of 264 and 325 naval aviation cadets. 
The cadets were instructed to give both the 
s-d and the i-d response to each item. Four of 


1 An extended report of this study may be obtained 
without charge from Marshall B. Jones, 418 
South Second Street, Pensacola, Florida, or for a fee 
from the American Documentation Institute. Order 
Document No. 5272, remitting $1.25 for microfilm or 
$1.25 for photocopies. 

2 The contents of this note are not to be construed 
as necessarily reflecting the view of the Navy De- 
partment. 
Table 1 
Correlations of the Z Survey Scales with Their i-d Parallel and Predictor Scales 

                   Sample 1           Sample 2 
Scales             1       2          3        4 
Heteronomy        .70     .70        .54      .46 
Dependency        .65     .69        .57      .51 
Rigidity          .50     .62      [illeg.]   .47 
Anxiety           .07     .57        .30      .27 
Hostility         .46     .62        .42      .38 





the scales of the Z Survey are of the same 
length, 40 items, and have no item overlap 
with each other. The fifth, Heteronomy, has 
66 items, and overlaps all four of the other 
scales. In the first sample of 264 cadets, all 
197 i-d items were analyzed against each of 
the s-d scales. Any item which correlated at 
the .05 level with a given s-d scale was in- 
cluded in a corresponding i-d predictor scale. 
In Table 1 the correlations of the s-d scales 
with their i-d parallel scales (Columns 1 and 
3) and with their i-d predictor scales (Col- 
umns 2 and 4) appear. Upon cross validation 
the i-d predictors correlate no better, if not a 
little worse, than do the i-d parallel scales. 
The burden, therefore, of this note is negative. 
The i-d parallel scales seem already to 
have absorbed so much of the variance avail- 
able to i-d items that no increase obtains from 
ordinary item-analytic procedures. 
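
The item-selection rule described above (retain any i-d item that correlates with the s-d scale at the .05 level) might be sketched as follows. The note does not say which correlation statistic or test was used, so the Pearson coefficient, the t criterion, the function names, and the toy data are all assumptions.

```python
import math

def pearson_r(x, y):
    """Product-moment correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # a constant item carries no information
    return sxy / math.sqrt(sxx * syy)

def select_items(item_responses, scale_scores, t_crit=1.96):
    """Return ids of items whose correlation with the scale exceeds the
    critical t. The default 1.96 is a large-sample approximation."""
    n = len(scale_scores)
    selected = []
    for item_id, responses in item_responses.items():
        r = pearson_r(responses, scale_scores)
        if abs(r) >= 1.0:
            selected.append(item_id)
            continue
        t = r * math.sqrt((n - 2) / (1.0 - r * r))
        if abs(t) > t_crit:
            selected.append(item_id)
    return selected

# Toy data: 20 hypothetical subjects, two hypothetical items.
scores = list(range(20))
items = {"A": [0] * 10 + [1] * 10, "B": [0, 1] * 10}
print(select_items(items, scores, t_crit=2.101))  # 2.101: tabled t for df = 18
```

Because items are chosen for their correlation in the same sample, the resulting scale capitalizes on chance, which is why the note evaluates the predictor scales by cross validation in the second sample.
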

Brief Report. 

Received March 21, 1957. 
Differential Qualitative Performance of Delinquents 
on the Porteus Maze 


Gilbert Fooks 
Hartford Regional Technical High School 


and Ross R. Thomas 


Newington Home and Hospital, Connecticut 


Several studies (1, 4, 6) have reported that 
the Qualitative (Q) score of the Revised 
Porteus Maze significantly differentiates de- 
linquents from nondelinquents. The present 
study was undertaken as a further cross 
validation of these results as well as to deter- 
mine the efficacy of the new Maze Extension 
in discriminating between normal and delin- 
quent adolescents. 


Subjects and Procedure 


Twenty-five girls and twenty-five boys from 
Connecticut institutions for delinquents were 
matched with nondelinquent high school sub- 
jects from the New London High School on 
the basis of age, sex, intelligence, and socio- 
economic level of parents. Table 1 contains 
the matching data for age and intelligence. 
The intelligence estimates available for the 
delinquent group were from the Wechsler- 
Bellevue and Stanford-Binet tests; the Otis 
Group Intelligence Test was the measure 
available for the nondelinquent group. Match- 
ing on the basis of parental occupation was 
less successful, the nondelinquent children 
representing a generally higher socioeconomic 
group. 

All of the delinquents were diagnosed as 
psychopaths or having psychopathic tenden- 
cies by the institution’s psychiatrists. The 
high school subjects were judged by the school 


1 The authors wish to express their appreciation to 
the psychiatric and social service staffs at the Long 
Lane School for Girls, and the Meriden School for 
Boys, and to the Principal and Deans at the New 
London High School. 
administrators to be normal high school students 
with no record of antisocial behavior. 
Both groups were tested with the Revised 
Maze and the new Maze Extension, the for- 
mer being given first in all cases. Scoring 
on the Revised Maze was as prescribed in 
the manual (4) with the exception of “Lift 
Pencil” which, on the advice of Porteus,² was 


Table 1 
Age and Intelligence of Experimental and Control Groups 

                    Delinquents                Normals 
Measure          Boys        Girls         Boys        Girls 
N                 25          25            25          25 
Mean Age         14.5        15.2          14.6        15.3 
Age Range      13-16.2     13-16.9       14-16.4     13.1-16.1 
Mean IQ          96.4        95.4          97          90.3 
IQ Range        79-143      82-124        81-137      82-126 





penalized five points after five “Lift Pencils” 
on one test sheet. The Extension scoring was 
essentially the same with the suggestion by 
Porteus to score the “Wavy Line” error on the 
Extension by referring to the criteria available 
for the Revised Maze. 


Results and Discussion 


In Table 2 are given the results of the pres- 
ent study along with those of three previous 
studies. The Q score of both the Revised Maze 
and the Maze Extension significantly differ- 


2 Personal communication. 







Table 2 
Q-Score Means, Sigmas, and Significance of Difference Between Delinquent and Normal Groups 

                                Delinquents              Normals 
Study                        N     Mean     SD       N     Mean     SD       p 
Porteus (4)                 100     49              100     22      13     .0001 
Porteus (4)                  50     48              179     19      13     .0001 
Docter & Winder (1)          60     47               60     25      20     .001 
Present (Revised Maze)       50     40     18.4      50    22.2    14.8    .001 
Present (Extension)          50    46.4    20.1      50    27.8    17.3    .001 








entiated between the two groups, thus con- 
firming the results previously reported. No sex 
differences were found on the Q score indicat- 
ing that previous findings may be generalized 
to female groups. 

It was found that Porteus’ critical cutoff 
score of 29 correctly identified 76% of the 
delinquents, and wrongly identified only 28% 
of the controls. This cutoff score is based on 
the weighted scoring system devised by Por- 
teus. Docter and Winder (1), using a non- 
weighted system in addition to the Porteus 
system, concluded that the more time-con- 
suming weighted system was not as efficient 
as the nonweighted system. Application of 
their unweighted score cutoff of 16 to the 
present group correctly identified 74% of the 
delinquents and incorrectly identified 28% of 
the control group. It may be concluded, there- 
fore, that, at least with regard to distinguish- 
ing between delinquents and nondelinquents, 
the unweighted Q score is not only easier to 
determine but is nearly as efficient as the 
Porteus system. Table 3 shows rates of identi- 
fication using both cutoff scores on the Re- 
vised Maze and on the Maze Extension, to- 
gether with a summary of previous findings 
by other investigators. 

If further studies of the Maze Extension 
bear out our findings, it would appear that the 
use of the Extension for detection or diagno- 
sis of delinquency would depend on the rela- 
tive importance of a high true-positive rate 
vs. a high false-positive rate. 
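
The trade-off between hits and false positives can be made concrete with a short calculation on the percentages in Table 3. This is an illustrative sketch only: the function name is hypothetical, and overall accuracy is computed on the assumption of the study's equal groups of 50, which need not hold in practice.

```python
def classification_summary(pct_delinquents_above, pct_normals_above,
                           n_delinquents=50, n_normals=50):
    """Summarize a cutoff score from the percentage of each group above it."""
    hits = pct_delinquents_above / 100 * n_delinquents
    false_positives = pct_normals_above / 100 * n_normals
    correct_rejections = n_normals - false_positives
    accuracy = (hits + correct_rejections) / (n_delinquents + n_normals)
    return {
        "hits": hits,
        "false_positives": false_positives,
        "accuracy": accuracy,
    }

# Percentages taken from Table 3 for the present study.
revised_weighted = classification_summary(76, 28)
extension_unweighted = classification_summary(84, 54)
print(round(revised_weighted["accuracy"], 2))
print(round(extension_unweighted["accuracy"], 2))
```

On these figures the unweighted Extension cutoff identifies more delinquents but misclassifies more than half of the normals, so its overall accuracy is lower than that of the weighted Revised Maze cutoff.
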

In support of the nonweighted scoring sys- 
tem, Docter and Winder determined that only 
four of the eight error scores, when applied 
singly, differentiated delinquents from non- 
delinquents. In our study, only one individual 


error score, that of the incidence of “Wavy 
Lines,” discriminated significantly between the 
two groups. The average delinquent in our 
study obtained a score of 4.5 for this error as 
compared to an average of only 1.16 for the 
controls (probability of chi square less than 
.05). Thus, neither study indicates sufficient 
discrimination on the basis of individual error 
scores to warrant individual use of a single 
error score. 

In order to further determine whether in- 
tellectual ability per se had any effect on the 
Q score, the correlation of Porteus Quantita- 
tive (a measure of intelligence) and Q scores 
was computed. The resultant r’s are given in 
Table 4. Three of the four r’s are not signifi- 
cant and are lower than those reported by 
Porteus (4). Thus the Q score appears to be 


Table 3 
Percentages of Delinquents and Normals Above Cutoff Scores Determined by 
Weighted and Unweighted Scoring Systems 

                              Delinquents              Normals 
Study                       Cutoff    % Above      Cutoff    % Above 
Weighted 
  Porteus (4)                 29        80           29        21 
  Wright (6)                  29        78           --        -- 
  Docter & Winder (1)         29        70           29        30 
  Present (Revised)           29        76           29        28 
  Present (Extension)         29        74           29        42 
Unweighted 
  Docter & Winder (1)         16     not given       16     not given 
  Present (Revised)           16        74           16        28 
  Present (Extension)         16        84           16        54 


























Table 4 
Relationship of Quantitative Score to Q Score 

                           Delinquents           Normals 
Study                      r         p          r         p 
Porteus (4)              -.44     [illeg.] 
Present (Revised)        -.03       ns        -.004       ns 
Present (Extension)      -.03       ns        -.39       .01 








generally independent of intellectual ability, 
and use of the Q score as an independent 
measure is supported. 

Further evidence on the reliability of scor- 
ing was obtained. A correlation of .98 was ob- 
tained for independent scorings of 50 of the 
completed tests by the authors. The same cor- 
relation has also been reported by Docter and 
Winder (1). 


Summary and Conclusions 


1. Groups of 50 delinquents and 50 non- 
delinquents were given the Revised Porteus 
Maze Test and Extension. Qualitative (Q) 
scores on both tests significantly differenti- 
ated between delinquents and nondelinquents 
(p < .001). 
2. No sex differences were found on the Q 
score, indicating that previous results may be 
generalized to females. 

3. Results of the present study support the 
hypothesis that no significant relationship 
exists between intelligence, as estimated from 
the Porteus Quantitative score, and Q score. 

4. Evidence is reported which suggests that 
a nonweighted scoring system is nearly as 
efficient as the present weighted system of 
scoring. 

5. Interscorer reliability was found to be 
satisfactory for the Q score (r = .98). 


Received October 29, 1956. 
Manifest Anxiety and Projective and Objective 
Measures of Need Achievement 


A. W. Bendig¹ 


University of Pittsburgh 


Two of the most popular psychometric 
measures of drives and/or motives are Tay- 
lor’s Manifest Anxiety Scale (4) and McClel- 
land’s projective need Achievement scale (2). 
Raphelson (3) has reported a nonsignificant 
correlation of — .25 between these scales for 
a small sample (N = 25) of college Ss. No 
reports have appeared to date on the relation- 
ship between McClelland’s projective measure 
of “need Achievement” and the objective scale 
purportedly measuring “need Achievement” 
which is included in Edwards’ Personal Pref- 
erence Schedule (1). 

The above three scales were administered 
to 244 students (136 men and 108 women) 
enrolled in a course in introductory psychology.² 
The raw scores from the scales were 
converted to stanines on the basis of norms 
developed from other groups of Ss and the 
product-moment correlations among the scales 
computed for men and women Ss separately. 
The differences between the correlations from 
the two sex groups were found to be statisti- 
cally not significant. Consequently, the cor- 
relations were averaged by the usual r-to-z 
method. 
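
The r-to-z averaging mentioned above can be sketched as follows. The separate correlations for men and women are not given in this brief report, so the values in the example are hypothetical; weighting each z by n - 3 is one common convention and is assumed here rather than taken from the text.

```python
import math

def average_correlations(pairs):
    """Average correlations through Fisher's z transformation.

    pairs: iterable of (r, n) tuples; each z is weighted by n - 3.
    """
    pairs = list(pairs)
    weighted_sum = sum((n - 3) * math.atanh(r) for r, n in pairs)
    total_weight = sum(n - 3 for _, n in pairs)
    return math.tanh(weighted_sum / total_weight)

# Hypothetical correlations for 136 men and 108 women.
print(round(average_correlations([(0.10, 136), (-0.02, 108)]), 3))
```

For correlations this close to zero the transformation changes very little; its effect becomes noticeable only as the coefficients approach 1 in absolute value.
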

The correlation between the MAS and Mc- 
Clelland’s n-Ach scale was .06 while the cor- 


1 An extended report of this study may be obtained 
without charge from A. W. Bendig, Dept. of 
Psychology, University of Pittsburgh, Pittsburgh 13, 
Pa., or for a fee from the American Documentation 
Institute. Order Document No. 5276, remitting $1.25 
for microfilm or $1.25 for photocopies. 

2 The author wishes to express his appreciation to 
Messrs. Jack Dunsing and Oakley Ray for their 
administration and scoring of the projective need 
achievement test. 


relation of the MAS and Edwards’ n-Ach 
scale was -.05. Neither of these two correlations 
approaches statistical significance (N = 
244). The correlation between the projective 
and objective measures of n-Ach was .11 
which is not significant at the .05 level, but 
is barely significant at the .10 level of con- 
fidence. 

The correlations of the three major vari- 
ables with a measure of verbal ability were 
also calculated. All three correlations were 
nonsignificant: — .01, .02, and .03. 

The results indicate that the three scales 
are measuring quite independent traits. The 
small correlation between the two measures 
of “need Achievement” suggests either that 
each scale measures a different type of “need 
Achievement,” or that the low stability reli- 
ability of McClelland’s scale results in con- 
siderable attenuation of the relationship. 


Brief Report. 
Received April 29, 1957. 
The Effectiveness of the Bender-Gestalt 
in Differential Diagnosis¹ 


Arthur S. Tamkin 
VA Hospital, Northampton, Massachusetts 


Since the introduction of the Visual Motor 
Gestalt test by Lauretta Bender (2) in 1938, 
there has arisen considerable interest in its 
use as a differentially diagnostic instrument 
for psychiatric disorders. While its efficacy in 
identifying cases of organic brain disease, in 
which there is a disintegration of visual motor 
functions, has been considered to rest on firm 
ground, the question of its applicability to 
the functional mental disorders has not been 
settled. Bender found deviations of visual mo- 
tor Gestalt patterns in her studies of schizo- 
phrenic children and adults, but because the 
personality disturbances of psychoneurotics 
seldom invade their visual motor sphere, she 
did not find their records to show deviations. 
Hutt (6), however, was able to delineate 
characteristic distortions in the Bender draw- 
ings of schizophrenics and psychoneurotics 
which distinguished these clinical groups from 
each other and from patients with organic 
brain damage. Billingslea (3) found no sup- 
port for Hutt’s proposed psychoneurotic signs, 
nor was Hanvick (5) able to differentiate be- 
tween psychoneurotics with functional back- 
ache and control patients with proven organic 
disease of the back. 

With the introduction of an objective scor- 
ing method by Pascal and Suttell (8) suc- 
cessful separation between clinical groups with 
functional diagnoses has been reported. In ad- 
dition to Pascal and Suttell, Lonstein (7) and 
Bowland and Deabler (4), all using the same 


1 From the Veterans Administration, Northampton, 
Massachusetts. The author wishes to thank the fol- 
lowing members of the Clinical Psychology Service 
for their constructive review of the manuscript: Drs. 
Isidor Scherer, Arnold Trehub, Cesareo D. Peña, and 
C. James Klett. 
scoring method, reported discrimination between 
hospitalized psychotics and nonpsychotic 
psychiatric patients at high levels of 
statistical significance. Because of Pascal and 
Suttell’s findings that age did not materially 
affect scoring levels within the age range of 
15 to 50 years, controls for age were ignored 
in these studies. Furthermore, the factors of 
education and chronicity were not uniformly 
controlled by these experimenters. The pres- 
ent study was an attempt to cross validate the 
findings that the Pascal and Suttell scoring 
method of the Bender-Gestalt differentiates 
significantly between functionally psychotic 
patients and those with nonpsychotic, func- 
tional mental disorders. 


Procedure 


The subjects (Ss) used in this study con- 
sisted of a group of 27 psychotics and a group 
of 27 neurotics and personality disorders 
matched on the basis of age. They were all 
male patients at the Veterans Administration 
Hospital, Northampton, Massachusetts, who 
had taken the Bender-Gestalt in conjunction 
with other psychological tests for routine psy- 
chodiagnostic evaluations. Except for a few, 
they had been tested shortly after their ar- 
rival as new admissions or readmissions, and 
their diagnoses, representing functional psy- 
chiatric disorders, were established later by 
neuropsychiatric staff conferences. All Ss had 
sufficient education to permit the computation 
of Pascal and Suttell’s z score; that is, at 
least one year of high school. The psychotic 
group ranged in age from 20 to 42 years, with 
a mean age of 30.85, and the nonpsychotic 
group ranged in age from 21 to 43 years, with 
a mean age of 31.63. Thus, the Ss were representative 
of the high school and college educated 
hospitalized patients with functional 
mental disorders of recent exacerbation who 
might require differential diagnoses. 

Each S’s Bender-Gestalt protocol was scored 
by the Pascal and Suttell method without the 
scorer’s knowledge of the diagnosis, and the 
corresponding z score for the S’s educational 
level was determined. For 37 Ss of these sam- 
ples, scores on the F and Critical Item scales 
of the MMPI were obtained. These two scales 
have been shown to be related to degree of 
psychopathology when applied to similar pa- 
tients (9), and they were used as indices of 
psychopathology. 


Results 


The correlation coefficient between age and 
z score was found to be + .29, which is sig- 
nificant at the .05 level. The mean z score of 
the group of psychotics was 59.19, and of the 
group of neurotics and personality disorders 
was 61.59. A t test of the difference between 
the means yielded a value of 0.58, indicating 
no significant difference. Since the weights of 
the reproductive errors derived by Pascal and 
Suttell may not have been applicable to the 
samples used in this study, the numbers of 
raw errors produced by each group were com- 
pared. The mean number of raw errors of the 
psychotic group was 7.44, and for the group 
of neurotics and personality disorders it was 
9.44. A t of 1.74, however, was not significant 
at the .05 level. Since MMPI profiles of 
37 Ss were available, an attempt was made to 
determine if z scores were correlated with test 
measures of psychopathology, if not with psy- 
chiatric diagnosis. Scores of both F and Criti- 
cal Item scales seemed to be suitable criteria 
of degree of psychopathology since they each 
differentiated the two clinical groups at the 
.05 level, based upon one-tailed hypotheses. 
Accordingly, correlation coefficients were com- 
puted for z and F and for z and the Critical 
Item scale. The obtained values of + .29 and 
+ .17 were not significant. 
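
It may seem odd that a coefficient of .29 is significant for age and z score but not for z and F. The difference lies in sample size, as the sketch below shows. It assumes that the age correlation was computed on all 54 Ss and the MMPI correlations on the 37 Ss with profiles, and that a two-tailed test was used; the critical values are approximate figures from standard t tables and the function name is hypothetical.

```python
import math

def t_from_r(r, n):
    """t statistic for testing a correlation against zero (df = n - 2)."""
    return r * math.sqrt((n - 2) / (1.0 - r * r))

# Approximate two-tailed .05 critical values from standard t tables.
T_CRIT_DF52 = 2.007
T_CRIT_DF35 = 2.030

t_age = t_from_r(0.29, 54)   # age vs. z score, all 54 Ss (assumed)
t_f = t_from_r(0.29, 37)     # z vs. F scale, 37 Ss with MMPI profiles
print(round(t_age, 2), t_age > T_CRIT_DF52)
print(round(t_f, 2), t_f > T_CRIT_DF35)
```
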


Discussion 
The failure to find significant differences be- 


tween the Bender-Gestalt scores of these two 
clinical groups, when the important extraneous 
variables of age, education, and chronicity 
were controlled, contrasts sharply with the 
positive findings reported by other investi- 
gators using the Pascal and Suttell scoring 
method. In comparing the scoring levels of 
this psychotic group, one finds lower raw 
scores and z scores than those reported by 
the other investigators, while for the nonpsy- 
chotic group the scores are more nearly simi- 
lar (1, 4, 7, 8). It may be of some significance 
in explaining the divergent findings of this 
study that none except Addington controlled 
for the age variable, nor did they uniformly 
control for education or chronicity. Adding- 
ton, who selected subjects with at least two 
years of hospitalization, obtained the highest 
raw score in his schizophrenic group. 

In conclusion, it appears that the Bender- 
Gestalt, scored by the Pascal and Suttell 
method, is of dubious effectiveness in differ- 
entiating between functional psychiatric dis- 
orders. In this study, it failed to separate hos- 
pitalized psychiatric patients with functional 
psychoses from hospitalized neurotics and per- 
sonality disorders, nor did it correlate signifi- 
cantly with MMPI-derived indices of psy- 
chopathology. 


Summary 


The effectiveness of the Bender-Gestalt, 
scored by the Pascal and Suttell method, in 
differentiating the functional mental disorders, 
was investigated. The z scores were computed 
from the Bender-Gestalt protocols of a 
group of 27 functional psychotics and a group 
of 27 neurotics and personality disorders 
matched on the basis of age. All Ss were se- 
lected from newly admitted or readmitted 
hospital patients who had at least ninth grade 
education. The findings showed no significant 
differences between the two clinical groups 
and no significant correlations between z 
scores and two MMPI-derived indices of 
psychopathology. A significant correlation be- 
tween age and z score was obtained, contrary 
to the findings of Pascal and Suttell. It was 
concluded that the Bender-Gestalt, scored by 
the Pascal and Suttell method, has dubious 
effectiveness as a differentially diagnostic in- 
strument for the functional mental disorders. 


Received October 16, 1956. 
Rorschach Animal Responses and Intelligence 


Robert Sommer 
Southeast Louisiana Hospital 


Some writers have alluded to a negative 
relationship between the number of animal 
responses on the Rorschach and intelligence 
(1), but the evidence is equivocal as is the 
case with animal movement responses (3). 
The present paper summarizes the results 
from a comparison of Rorschach A, A%, and 
FM with Wechsler-Bellevue verbal scale score 
for 123 psychiatric patients at Southeast 
Louisiana Hospital, the total number of cases 
in the Psychology Department files for whom 
both records were available. The population 
consisted of 77 males and 46 females from all 
diagnostic categories. The Wechsler-Bellevue 
verbal scale was used as the IQ criterion, as 
it was felt that this would be less affected by 
conditions of depression, organicity, or se- 
nility than would total scale score. 

The Pearson coefficient between A and IQ 
is .27 + .08. However, it should be noted that 
previous research has found a relationship be- 
tween A and R (4) and between R and IQ 
(2). In the present study, these coefficients 
were .87 and .29, respectively. Hence, it 
seemed advisable to run a partial correlation 
between A and IQ, holding the effects of R 
constant. When this was done, the correlation 
between A and IQ was found to be .04. 

The other index of A responses is the A% 
which automatically takes R into account but, 
like most ratios, tends to be somewhat un- 
stable. The correlation between A% and IQ 
is .02. 


The correlation between FM and IQ proved 
to be .34 ± .08. When the correlations between 
FM and R (.68), and between R and 
IQ (.29) were taken into account, the partial 
correlation between FM and IQ was found to 
be .20 ± .09, which is significant at the .05 
level and supports the results found by 
Tucker. 
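
The two partial correlations reported above can be re-derived from the published zero-order coefficients with the standard first-order partial correlation formula. The sketch below is offered only as an arithmetic check; the report does not say how its computations were carried out, and the function and variable names are illustrative assumptions rather than anything from the source.

```python
import math


def partial_r(r_xy, r_xz, r_yz):
    """First-order partial correlation of x and y with z held constant."""
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz**2) * (1 - r_yz**2))


# Zero-order coefficients as reported in the text (N = 123).
r_a_iq, r_a_r = 0.27, 0.87    # A with IQ, A with R
r_fm_iq, r_fm_r = 0.34, 0.68  # FM with IQ, FM with R
r_r_iq = 0.29                 # R with IQ

partial_a = partial_r(r_a_iq, r_a_r, r_r_iq)
partial_fm = partial_r(r_fm_iq, r_fm_r, r_r_iq)

print(f"A  and IQ, R held constant: {partial_a:.2f}")
print(f"FM and IQ, R held constant: {partial_fm:.2f}")
```

Rounded to two places, these values agree with the reported .04 and .20. Because A and R correlate so highly (.87), nearly all of the zero-order relationship between A and IQ is absorbed once R is held constant.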


Summary 


When the number of responses given by the 
subject was taken into account, there was no 
over-all relationship between the number of 
animal responses and Wechsler-Bellevue verbal 
IQ for a psychiatric population. However, 
there was a small but statistically significant 
positive relationship between animal move- 
ment responses and IQ. 


Received November 30, 1956. 
Crawford, John E., & Dorothea M. Crawford Small 
Parts Dexterity Test. Manual (rev. 1956), pp. 12. 
New York: Psychological Corp., 1956. 


The Small Parts Dexterity Test is a performance 
test of fine eye-hand coordination, used for person- 
nel selection and guidance. It involves tweezer dex- 
terity with small pins and collars, and starting and 
tightening small screws with a screwdriver. The re- 
vised manual gives percentile norms based on some 
thousands of employees, applicants, and students. 
There are data on reliabilities and on the relationships 
of the test to industrial criteria. —L. F. S. 


Gough, Harrison G. California Psychological Inven- 
tory. High school-college-adult. 1 form. Untimed 
(45-60) min. Test booklet ($6.25 per 25, $21.75 per 
100); hand-scoring or IBM answer sheet, and profile 
($3.75 per 50, $16.50 per 250); hand-scoring or 
IBM stencils ($3.00 per set); sample set ($1.00); 
manual, pp. 40 ($3.00). Palo Alto, Calif.: Consult- 
ing Psychologists Press, 1956, 1957. 


The California Psychological Inventory (CPI) bears 
considerable resemblance to the group MMPI, from 
which about 200 of its 468 items were adapted. But 
the purpose of the CPI is quite different. It is in- 
tended primarily for use with normal subjects, not 
patients, and strives to assess personality character- 
istics important for social living. The 18 scales now 
available are divided into four groups: I. Measures 
of poise, ascendancy, and self-assurance (Do, domi- 
nance; Cs, capacity for status; Sy, sociability; Sp, 
social presence; Sa, self-acceptance; Wb, sense of well-being). 
II. Measures of socialization, maturity, and 
responsibility (Re, responsibility; So, socialization; 
Sc, self-control; To, tolerance; Gi, good impression; 
Cm, communality). III. Measures of achievement po- 
tential and intellectual efficiency (Ac, achievement 
via conformance; Ai, achievement via independence; 
Ie, intellectual efficiency). IV. Measures of intellectual 
and interest modes (Py, psychological-mindedness; 
Fx, flexibility; Fe, femininity). 

All but four of the scales were developed by item 
analyses against external criteria. So, for example, 
discriminates between delinquents and nondelinquents, 
and also has relationships with adjectival descriptions 
of normal persons. Four scales, Sp, Sa, Sc, and Fx, 
were formulated by content and refined by item 
analyses for internal consistency. The manual con- 
tains a wealth of information on the validities of 
scales and on the interpretations of single scales, in- 
teractions, and profiles. The matrices of scale inter- 
correlations, based on 4,098 males and 5,083 females, 
show relationships which vary from zero to the 
seventies. Retest reliabilities after one to three weeks 
cluster around .80; after a year, from .48 to .75. The 
problem of dissimulation is discussed extensively, and 
three scales, Gi, Wb, and Cm are shown to have spe- 
cial value in assessing it. Standard score norms are 
based on over 6,000 cases for males and 7,000 for fe- 
males; some 50,000 cases have already contributed to 
research on the instrument. The manual gives much 
other research data, and cites 54 references. 

By both objective and subjective evaluation, the 
CPI appears to be a major achievement. It will surely 
receive wide use for research and for practical applications. —L. F. S. 